Abstract. In a spiked population model, the population covariance matrix has all its eigenvalues equal to one
except for a few fixed eigenvalues (spikes). This model was proposed by Johnstone to account for empirical findings
on various data sets. The question is to quantify the effect of the perturbation caused by the spike eigenvalues. A
recent work by Baik and Silverstein establishes the almost sure limits of the extreme sample eigenvalues associated
with the spike eigenvalues when the population and the sample sizes become large. This paper establishes the limiting
distributions of these extreme sample eigenvalues. As another important result of the paper, we provide a central
limit theorem on random sesquilinear forms.

Résumé. In a model with heterogeneous variances, the eigenvalues of the covariance matrix of the variables
are all equal to one except for a small number of them. This model was introduced by Johnstone as
a possible explanation of the structure of the eigenvalues of the sample covariance matrix observed on
several real data sets. An important question is to quantify the perturbation caused by these
eigenvalues different from one. A recent work by Baik and Silverstein establishes the almost sure limit of the
extreme sample eigenvalues when the number of variables tends to infinity proportionally to the sample
size. This work establishes a central limit theorem for these extreme sample eigenvalues. It is
based on a new central limit theorem for random sesquilinear forms.

MSC: Primary 62H25; 62E20; secondary 60F05; 15A52 

Keywords: Sample covariance matrices; Spiked population model; Central limit theorems; Largest eigenvalue; Extreme 
eigenvalues; Random sesquilinear forms; Random quadratic forms 



1. Introduction 

It is well known that the empirical spectral distribution (ESD) of a large sample covariance matrix converges
to the family of Marčenko-Pastur laws under fairly general conditions on the sample variables [3, 10]. On the
other hand, the study of the largest or smallest eigenvalues is more complex. In a variety of situations, the
almost sure limits of these extreme eigenvalues are proved to coincide with the boundaries of the support of
the limiting distribution. As an example, when the sample vectors have independent coordinates with unit
variances and the ratio $p/n$ of the population size $p$ over the sample size $n$ tends to a positive
limit $y \in (0, 1)$, the limiting distribution is the classical Marčenko-Pastur law

$$F_y(dx) = \frac{1}{2\pi x y}\sqrt{(x - a_y)(b_y - x)}\,dx, \qquad a_y \le x \le b_y, \tag{1.1}$$

where $a_y = (1 - \sqrt{y})^2$ and $b_y = (1 + \sqrt{y})^2$. Moreover, the smallest and the largest eigenvalues converge almost
surely to the boundaries $a_y$ and $b_y$, respectively.
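
As a quick numerical illustration of the Marčenko-Pastur law (1.1), the following minimal sketch evaluates the support edges and checks by a midpoint rule that the density has total mass one and mean one. The function names `mp_edges`, `mp_density` and `mp_moment`, the step count and the value $y = 0.5$ are illustrative assumptions and are not part of the original text.

```python
import math

def mp_edges(y):
    """Support edges a_y = (1 - sqrt(y))^2 and b_y = (1 + sqrt(y))^2, for 0 < y < 1."""
    r = math.sqrt(y)
    return (1.0 - r) ** 2, (1.0 + r) ** 2

def mp_density(x, y):
    """Density sqrt((x - a_y)(b_y - x)) / (2 pi x y) on [a_y, b_y], zero elsewhere."""
    a, b = mp_edges(y)
    if x <= a or x >= b:
        return 0.0
    return math.sqrt((x - a) * (b - x)) / (2.0 * math.pi * x * y)

def mp_moment(k, y, steps=20000):
    """Midpoint-rule approximation of the k-th moment of F_y (k = 0 gives the mass)."""
    a, b = mp_edges(y)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += x ** k * mp_density(x, y)
    return total * h

if __name__ == "__main__":
    y = 0.5  # illustrative ratio p/n
    print("edges:", mp_edges(y))
    print("mass :", mp_moment(0, y))
    print("mean :", mp_moment(1, y))
```

The midpoint rule is used only because it needs no external library; any quadrature routine would serve equally well.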

Recent empirical data analysis from fields like wireless communication engineering, speech recognition or
gene expression experiments suggests that, frequently, some extreme eigenvalues of sample covariance matrices
are well separated from the rest. For instance, see Figs 1 and 2 in [9], which display the sample eigenvalues
of functional data consisting of a speech data set of 162 instances of a phoneme "del" spoken by males,
calculated at 256 points. As a possible explanation of this phenomenon, this author proposes a spiked
population model where all eigenvalues of the population covariance matrix are equal to one except for a fixed
and relatively small number of them (spikes). Clearly, a spiked population model can be considered
as a small perturbation of the so-called null case where all the eigenvalues of the population covariance
matrix are equal to one. This raises the question of how such a small perturbation affects the limits of the extreme
eigenvalues of the sample covariance matrix as compared to the null case.

The behavior of the largest eigenvalue in the case of complex Gaussian variables has been recently
studied in [7]. These authors prove a transition phenomenon: the weak limit as well as the scaling of the
largest eigenvalue differ according to whether the largest spike eigenvalue is larger than, equal to or less than
the critical value $1 + \sqrt{y}$. In [6], the authors consider the spiked population model with general random
variables: complex or real and not necessarily Gaussian. For the almost sure limits of the extreme sample
eigenvalues, they also find that these limits depend on the critical values $1 + \sqrt{y}$ and $1 - \sqrt{y}$ from above and
below, respectively. For example, if there are $M$ eigenvalues in the population covariance matrix larger than
$1 + \sqrt{y}$, then the $M$ largest eigenvalues of the sample covariance matrix will (almost surely) have their
limits above the right edge $b_y$ of the limiting Marčenko-Pastur law. Analogous results are also given for
the cases $y > 1$ and $y = 1$.

An important question here is to find the limiting distributions of these extreme eigenvalues. As mentioned
above, such results are given in [7] for the largest eigenvalue in the complex Gaussian case. In the same
perspective, assuming that the population vector is real Gaussian with a diagonal covariance matrix and
that the $M$ spike eigenvalues are all simple, [12] found that each of the $M$ largest sample eigenvalues has a
Gaussian limiting distribution.

In this paper, we follow the general set-up of [6]. Assuming $y \in (0, 1)$ and general population variables, we
will establish central limit theorems for the largest as well as for the smallest sample eigenvalues associated with
spike eigenvalues outside the interval $[1 - \sqrt{y}, 1 + \sqrt{y}]$. Furthermore, we prove that the limiting distribution of
such a sample extreme eigenvalue is Gaussian only if the corresponding spike population eigenvalue is simple.
Otherwise, if a spike eigenvalue is multiple, say of multiplicity $k$, then there will be $k$ packed consecutive sample
eigenvalues $\lambda_{n,1}, \ldots, \lambda_{n,k}$ whose normalized fluctuations converge jointly to the distribution of the
eigenvalues of a $k \times k$ symmetric (or Hermitian) Gaussian random matrix. Consequently, in this case, the limiting
distribution of a single $\lambda_{n,j}$ is generally non-Gaussian.

The main tools of our analysis are borrowed, on the one hand, from random matrix theory. For general
background on this theory, we refer to the book [11] and a modern review by Bai [3]. On the other hand, we
introduce in this paper another important tool, namely a CLT for random sesquilinear forms, which is
of independent interest. This CLT, independent from the rest of the paper, is presented in the last section
(Section 7).

The remaining sections of the paper are organized as follows. First in Section 2, we introduce the spiked
population model and recall known results on the almost sure limits of extreme sample eigenvalues. The main
result of the paper, namely a general CLT for extreme sample eigenvalues, Theorem 3.1, is then introduced
in Section 3. To provide a better account of this CLT, Section 4 develops in detail several meaningful
examples. Several sets of numerical computations are also conducted to give concrete illustrations of the
main result. In particular, we recover a CLT given in [12] as a special instance. In Section 5, we discuss some
extensions of these results to the case where spiked eigenvalues are inside the gaps located in the center
of the spectrum of the population covariance matrix. Finally, Section 6 collects the proofs of the presented
results based on a CLT for random sesquilinear forms which is itself introduced and proved in Section 7.



2. Spiked population model and convergence of extreme eigenvalues 

We consider a zero-mean, complex-valued random vector $x = (\xi^T, \eta^T)^T$ where $\xi = (\xi(1), \ldots, \xi(M))^T$ and
$\eta = (\eta(1), \ldots, \eta(p))^T$ are independent, of dimension $M$ and $p$, respectively. Moreover, we assume that
$E[\|x\|^4] < \infty$ and that the coordinates of $\eta$ are independent and identically distributed with unit variance.
The population covariance matrix of the vector $x$ is therefore

$$V = \operatorname{cov}(x) = \begin{pmatrix} \Sigma & 0 \\ 0 & I_p \end{pmatrix}.$$

We consider the following spiked population model by assuming that $\Sigma$ has $K$ non-null and non-unit
eigenvalues $\alpha_1, \ldots, \alpha_K$ with respective multiplicities $n_1, \ldots, n_K$ ($n_1 + \cdots + n_K = M$). Therefore, the eigenvalues
of the population covariance matrix $V$ are equal to one except the $(\alpha_j)$, called spike eigenvalues.
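Let $x_i = (\xi_i^T, \eta_i^T)^T$, $1 \le i \le n$, be $n$ i.i.d. copies of $x$. The sample covariance matrix is

$$S_n = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^*,$$

which can be rewritten as

$$S_n = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} = \begin{pmatrix} X_1 X_1^* & X_1 X_2^* \\ X_2 X_1^* & X_2 X_2^* \end{pmatrix} = \frac{1}{n} \begin{pmatrix} \sum \xi_i \xi_i^* & \sum \xi_i \eta_i^* \\ \sum \eta_i \xi_i^* & \sum \eta_i \eta_i^* \end{pmatrix}, \tag{2.1}$$

with

$$X_1 = \frac{1}{\sqrt{n}} (\xi_1, \ldots, \xi_n)_{M \times n} = \frac{1}{\sqrt{n}} \xi_{1:n}, \qquad X_2 = \frac{1}{\sqrt{n}} (\eta_1, \ldots, \eta_n)_{p \times n} = \frac{1}{\sqrt{n}} \eta_{1:n}.$$

It is assumed in the sequel that $M$ is fixed, and $p$ and $n$ are related so that when $n \to \infty$, $p/n \to y \in (0, 1)$.
The ESD of $S_n$, as well as the one of $S_{22}$, converges to the Marčenko-Pastur distribution $F_y(dx)$ given in
(1.1). As explained in the Introduction, a central question is to quantify the effect caused by the small
number of spiked eigenvalues on the asymptotics of the extreme sample eigenvalues.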

As a first general answer to this question, Baik and Silverstein [6] completely determine the almost sure
limits of the largest and smallest sample eigenvalues. More precisely, assume that among the $M$ eigenvalues of
$\Sigma$, there are exactly $M_b$ greater than $1 + \sqrt{y}$ and $M_a$ smaller than $1 - \sqrt{y}$:

$$\alpha_1 > \cdots > \alpha_{M_b} > 1 + \sqrt{y}, \qquad \alpha_M < \cdots < \alpha_{M - M_a + 1} < 1 - \sqrt{y}, \tag{2.2}$$

and $1 - \sqrt{y} \le \alpha_k \le 1 + \sqrt{y}$ for the other $\alpha_k$'s. Moreover, for $\alpha \ne 1$, we define the function

$$\lambda = \phi(\alpha) = \alpha + \frac{y\alpha}{\alpha - 1}. \tag{2.3}$$

As $y < 1$, we have $p < n$ for large $n$. Let

$$\lambda_{n,1} \ge \lambda_{n,2} \ge \cdots \ge \lambda_{n,p}$$

be the eigenvalues of the sample covariance matrix $S_n$. Let $s_i = n_1 + \cdots + n_i$ for $1 \le i \le M_b$, and $t_j =
n_M + \cdots + n_j$ for $1 \le j \le M_a$ (by convention $s_0 = t_0 = 0$).
With these notations, Baik and Silverstein [6] prove that for each $k \in \{1, \ldots, M_b\}$ and $s_{k-1} < j \le s_k$ (largest
eigenvalues) or $k \in \{1, \ldots, M_a\}$ and $p - t_k < j \le p - t_{k-1}$ (smallest eigenvalues),

$$\lambda_{n,j} \to \phi(\alpha_k) = \alpha_k + \frac{y\alpha_k}{\alpha_k - 1}, \quad \text{almost surely.} \tag{2.4}$$

In other words, if a spike eigenvalue $\alpha_k$ lies outside the interval $[1 - \sqrt{y}, 1 + \sqrt{y}]$ and has multiplicity $n_k$,
then $\phi(\alpha_k)$ is the limit of $n_k$ packed sample eigenvalues $\{\lambda_{n,j}, j \in J_k\}$. Here we have denoted by $J_k$ the
corresponding set of indexes: $J_k = \{s_{k-1} + 1, \ldots, s_k\}$ for $\alpha_k > 1 + \sqrt{y}$ and $J_k = \{p - t_k + 1, \ldots, p - t_{k-1}\}$ for
$\alpha_k < 1 - \sqrt{y}$.
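
The almost sure limits above are easy to evaluate. The following minimal sketch computes $\phi(\alpha) = \alpha + y\alpha/(\alpha - 1)$ and sorts a list of spikes according to the critical interval $[1 - \sqrt{y}, 1 + \sqrt{y}]$. The names `phi` and `classify_spikes` and the sample values are illustrative assumptions.

```python
import math

def phi(alpha, y):
    """Almost sure limit phi(alpha) = alpha + y * alpha / (alpha - 1), for alpha != 1."""
    return alpha + y * alpha / (alpha - 1.0)

def classify_spikes(alphas, y):
    """Split spikes into those above, below and inside [1 - sqrt(y), 1 + sqrt(y)]."""
    lo, hi = 1.0 - math.sqrt(y), 1.0 + math.sqrt(y)
    above = [a for a in alphas if a > hi]
    below = [a for a in alphas if a < lo]
    inside = [a for a in alphas if lo <= a <= hi]
    return above, below, inside

if __name__ == "__main__":
    y = 0.5
    above, below, inside = classify_spikes([4.0, 3.0, 0.2, 0.1], y)
    for a in above + below:
        # Only spikes outside the critical interval produce outlying sample eigenvalues.
        print(a, "->", phi(a, y))
```

At the critical values the function $\phi$ meets the edges of the support: $\phi(1 + \sqrt{y}) = (1 + \sqrt{y})^2 = b_y$ and $\phi(1 - \sqrt{y}) = (1 - \sqrt{y})^2 = a_y$.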



3. Main results 

The aim of this paper is to derive a CLT for the $n_k$ packed sample eigenvalues

$$\sqrt{n}[\lambda_{n,j} - \phi(\alpha_k)], \qquad j \in J_k,$$

where $\alpha_k \notin [1 - \sqrt{y}, 1 + \sqrt{y}]$ is some fixed spike eigenvalue of multiplicity $n_k$. The statement of the main
result of the paper, Theorem 3.1, needs several intermediate notations and results.

3.1. Determinant equation and a random sesquilinear form 

By definition, each $\lambda_{n,j}$ solves the equation

$$0 = |\lambda I - S_n| = |\lambda I - S_{22}|\,|\lambda I - K_n(\lambda)|, \tag{3.1}$$

where

$$K_n(\lambda) = S_{11} + S_{12}(\lambda I - S_{22})^{-1} S_{21}. \tag{3.2}$$

As $n \to \infty$, with probability 1, $\lambda_{n,j} \to \phi(\alpha_k) \notin [a_y, b_y]$ and the eigenvalues of $S_{22}$ go inside
the interval $[a_y, b_y]$, so the probability of the event

$$\Omega_n = \{\lambda_{n,j} \notin [a_y, b_y]\} \cap \{\text{spectrum of } S_{22} \subset [a_y, b_y]\}$$

tends to 1. Conditional on this event, the $(\lambda_{n,j})$'s then solve the determinant equation

$$|\lambda I - K_n(\lambda)| = 0. \tag{3.3}$$

Therefore, without loss of generality, we can assume that $\lambda_{n,j} \notin [a_y, b_y]$ and that they are solutions of this equation.
Furthermore, let

$$A_n = (a_{ij}) = A_n(\lambda) = X_2^* (\lambda I - X_2 X_2^*)^{-1} X_2, \qquad \lambda \notin [a_y, b_y]. \tag{3.4}$$

Lemma 6.1 detailed in Section 6.1 establishes the convergence of several statistics of the matrix $A_n$.
In particular, $n^{-1} \operatorname{tr} A_n$, $n^{-1} \operatorname{tr} A_n A_n^*$ and $n^{-1} \sum_{i=1}^{n} a_{ii}^2$ converge in probability to $y m_1(\lambda)$, $y m_2(\lambda)$ and
$(y[1 + m_1(\lambda)]/\{\lambda - y[1 + m_1(\lambda)]\})^2$, respectively. Here, the $m_j(\lambda)$ are some specific transforms of the
Marčenko-Pastur law $F_y$ (see Section 6.1 for more details).

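Therefore, the random form $K_n$ in (3.2) can be decomposed as follows:

$$\begin{aligned}
K_n(\lambda) &= S_{11} + X_1 A_n X_1^* = \frac{1}{n} \xi_{1:n} (I + A_n) \xi_{1:n}^* \\
&= \frac{1}{n} \{\xi_{1:n} (I + A_n) \xi_{1:n}^* - \Sigma \operatorname{tr}(I + A_n)\} + \frac{1}{n} \Sigma \operatorname{tr}(I + A_n) \\
&= \frac{1}{\sqrt{n}} R_n + [1 + y m_1(\lambda)] \Sigma + o_P\Big(\frac{1}{\sqrt{n}}\Big),
\end{aligned} \tag{3.5}$$

with

$$R_n = R_n(\lambda) = \frac{1}{\sqrt{n}} \{\xi_{1:n} (I + A_n) \xi_{1:n}^* - \Sigma \operatorname{tr}(I + A_n)\}. \tag{3.6}$$

In the last derivation, we have used the fact that

$$\frac{1}{n} \operatorname{tr}(I + A_n) = 1 + y m_1(\lambda) + o_P\Big(\frac{1}{\sqrt{n}}\Big),$$

which follows from a CLT for $\operatorname{tr}(A_n)$ (see [4]).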

3.2. Limit distribution of the random matrices $\{R_n(\lambda)\}$

The next step is to find the limit distribution of the sequence of random matrices $\{R_n(\lambda)\}$. The situation is
different for the real and complex cases. Define the constants

$$\theta = 1 + 2 y m_1(\lambda) + y m_2(\lambda), \tag{3.7}$$

$$\omega = 1 + 2 y m_1(\lambda) + \Big(\frac{y[1 + m_1(\lambda)]}{\lambda - y[1 + m_1(\lambda)]}\Big)^2. \tag{3.8}$$

Proposition 3.1 (Limiting distribution of $R_n(\lambda)$: real variables case). Assume that the variables
$\xi$ and $\eta$ are real-valued. Then, the random matrix $R_n$ converges weakly to a symmetric random matrix
$R = (R_{ij})$ with zero-mean Gaussian entries having the following covariance function: for $1 \le i \le j \le M$,
$1 \le i' \le j' \le M$,

$$\begin{aligned}
\operatorname{cov}(R_{ij}, R_{i'j'}) ={}& \omega\{E[\xi(i)\xi(j)\xi(i')\xi(j')] - \Sigma_{ij}\Sigma_{i'j'}\} + (\theta - \omega)\{E[\xi(i)\xi(j')]E[\xi(i')\xi(j)]\} \\
&+ (\theta - \omega)\{E[\xi(i)\xi(i')]E[\xi(j)\xi(j')]\}.
\end{aligned} \tag{3.9}$$

Note that in particular, the following formula holds for the variances:

$$\operatorname{var}(R_{ij}) = \theta(\Sigma_{ii}\Sigma_{jj} + \Sigma_{ij}^2) + \omega\{E[\xi^2(i)\xi^2(j)] - 2\Sigma_{ij}^2 - \Sigma_{ii}\Sigma_{jj}\}. \tag{3.10}$$

In case of a diagonal element $R_{ii}$, this expression simplifies to

$$\operatorname{var}(R_{ii}) = [2\theta + \beta_i \omega]\Sigma_{ii}^2, \qquad \text{with } \beta_i = \frac{E[\xi^4(i)]}{\Sigma_{ii}^2} - 3. \tag{3.11}$$

If moreover $\xi(i)$ is Gaussian, $\beta_i = 0$.

Remark 1. If the coordinates of $\xi$ are independent, then the limiting covariance matrix in (3.9) is
diagonal: the limiting Gaussian matrix has independent entries. Their variances simplify to (3.11)
and

$$\operatorname{var}(R_{ij}) = \theta \Sigma_{ii} \Sigma_{jj}, \qquad i < j. \tag{3.12}$$

Proposition 3.2 (Limiting distribution of $R_n(\lambda)$: complex variables case). Assume the general
case with complex-valued variables $\xi$ and $\eta$ and that the following limit exists:

$$m_4(\lambda) = \lim_n \frac{1}{n} \operatorname{tr} A_n A_n^T, \qquad \lambda \notin [a_y, b_y]. \tag{3.13}$$

Then, the random matrix $R_n$ converges weakly to a zero-mean Hermitian random matrix $R = (R_{ij})$. Moreover,
the joint distribution of the real and imaginary parts of the upper-triangular block $\{R_{ij}, 1 \le i \le j \le M\}$
is a $2K$-dimensional Gaussian vector (here $K = \frac{1}{2}M(M+1)$ is the number of entries of this block, as in
Section 6.2) with covariance matrix

$$\Gamma = \begin{pmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{pmatrix}, \tag{3.14}$$



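where

$$\Gamma_{11} = \frac{1}{4} \sum_{j=1}^{3} \{2\Re(B_j) + B_{ja} + B_{jb}\},$$

$$\Gamma_{22} = \frac{1}{4} \sum_{j=1}^{3} \{-2\Re(B_j) + B_{ja} + B_{jb}\},$$

$$\Gamma_{12} = \frac{1}{2} \sum_{j=1}^{3} \Im(B_j),$$

and for $1 \le i \le j \le M$ and $1 \le i' \le j' \le M$,

$$\begin{aligned}
B_1(ij, i'j') &= \omega\big(E[\xi(i)\bar\xi(j)\xi(i')\bar\xi(j')] - \Sigma_{ij}\Sigma_{i'j'}\big), \\
B_2(ij, i'j') &= (\theta - \omega)\,\Sigma_{ij'}\Sigma_{i'j}, \\
B_3(ij, i'j') &= (\tau - \omega)\big(E[\xi(i)\xi(i')]\,E[\bar\xi(j)\bar\xi(j')]\big), \\
B_{1a}(ij, i'j') &= \omega\big(E[|\xi(i)\xi(i')|^2] - \Sigma_{ii}\Sigma_{i'i'}\big), \\
B_{1b}(ij, i'j') &= \omega\big(E[|\xi(j)\xi(j')|^2] - \Sigma_{jj}\Sigma_{j'j'}\big), \\
B_{2a}(ij, i'j') &= (\theta - \omega)|\Sigma_{ii'}|^2, \\
B_{2b}(ij, i'j') &= (\theta - \omega)|\Sigma_{jj'}|^2, \\
B_{3a}(ij, i'j') &= (\tau - \omega)|E\,\xi(i)\xi(i')|^2, \\
B_{3b}(ij, i'j') &= (\tau - \omega)|E[\xi(j)\xi(j')]|^2.
\end{aligned}$$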
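Here, the constant $\tau$ equals

$$\tau = \lim_n \frac{1}{n} \operatorname{tr}(I + A_n)(I + A_n)^T = 1 + 2 y m_1(\lambda) + m_4(\lambda). \tag{3.15}$$

The limiting covariance matrix $\Gamma$ has a complicated expression. However, the variance of a diagonal
element $R_{ii}$ has a much simpler expression if moreover $E[\xi^2(i)] = 0$ for all $1 \le i \le M$:

$$\operatorname{var}(R_{ii}) = [\theta + \beta_i' \omega]\Sigma_{ii}^2, \qquad \text{with } \beta_i' = \frac{E[|\xi(i)|^4]}{\Sigma_{ii}^2} - 2. \tag{3.16}$$

In particular, if $\xi(i)$ is Gaussian, $\beta_i' = 0$.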

3.3. CLT for extreme eigenvalues

In order to introduce the main result of the paper, let the spectral decomposition of $\Sigma$ be

$$\Sigma = U \begin{pmatrix} \alpha_1 I_{n_1} & \cdots & 0 \\ 0 & \ddots & 0 \\ 0 & \cdots & \alpha_K I_{n_K} \end{pmatrix} U^*, \tag{3.17}$$

where $U$ is a unitary matrix. Following Section 2, for each spiked eigenvalue $\alpha_k \notin [1 - \sqrt{y}, 1 + \sqrt{y}]$, let
$\{\lambda_{n,j}, j \in J_k\}$ be the packed eigenvalues of the sample covariance matrix which all tend almost surely to
$\lambda_k = \phi(\alpha_k)$. Let $R(\lambda_k)$ be the Gaussian matrix limit of the sequence of matrices of random forms $[R_n(\lambda_k)]_n$
given in Proposition 3.1 (real variables case) and Proposition 3.2 (complex variables case), respectively. Let

$$\widetilde{R}(\lambda_k) = U^* R(\lambda_k) U. \tag{3.18}$$

Theorem 3.1. For each spike eigenvalue $\alpha_k \notin [1 - \sqrt{y}, 1 + \sqrt{y}]$, the $n_k$-dimensional real vector

$$\sqrt{n}\{\lambda_{n,j} - \lambda_k,\ j \in J_k\}$$

converges weakly to the distribution of the $n_k$ eigenvalues of the Gaussian random matrix

$$\frac{1}{1 + y m_3(\lambda_k)\alpha_k}\, \widetilde{R}_{kk}(\lambda_k),$$

where $\widetilde{R}_{kk}(\lambda_k)$ is the $k$th diagonal block of $\widetilde{R}(\lambda_k)$ corresponding to the indexes $\{u, v \in J_k\}$.

One striking fact from this theorem is that the limiting distributions of such $n_k$ packed sample extreme
eigenvalues are generally non-Gaussian and asymptotically dependent. Indeed, the limiting distribution of a
single sample extreme eigenvalue $\lambda_{n,j}$ is Gaussian if and only if the corresponding population spike eigenvalue
is simple.



4. Examples and numerical results 

This section describes in more detail the content of Theorem 3.1 with several meaningful
examples together with extended numerical computations.

4.1. A special Gaussian case from Paul [12]

We consider a particular situation examined in [12]. Assume that the variables are real Gaussian and that $\Sigma$ is
diagonal with all eigenvalues simple. In other words, $K = M$ and $n_k = 1$ for all $1 \le k \le M$. Hence, $U = I_M$.
Following Theorem 3.1, for any $\lambda_k = \phi(\alpha_k)$ with $\alpha_k \notin [1 - \sqrt{y}, 1 + \sqrt{y}]$, $\sqrt{n}(\lambda_{n,k} - \lambda_k)$ converges weakly to the
Gaussian variable $(1 + y m_3(\lambda_k)\alpha_k)^{-1} R(\lambda_k)_{kk}$. This variable is zero-mean. For the computation of its variance,
we remark that by Eq. (3.11)

$$\operatorname{var} R(\lambda_k)_{kk} = 2\theta\alpha_k^2,$$

where

$$\theta = 1 + 2 y m_1(\lambda_k) + y m_2(\lambda_k) = \frac{(\alpha_k - 1 + y)^2}{(\alpha_k - 1)^2 - y}.$$

Taking into account (6.6), we get finally, for $1 \le k \le M$,

$$\sqrt{n}(\lambda_{n,k} - \lambda_k) \Rightarrow N(0, \sigma_{\alpha_k}^2), \qquad \sigma_{\alpha_k}^2 = \frac{2\alpha_k^2[(\alpha_k - 1)^2 - y]}{(\alpha_k - 1)^2}.$$

This coincides with Theorem 3 of [12].
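
The variance computation of this subsection can be replayed numerically. The sketch below evaluates the limiting variance in two ways: first through $\theta$ in (3.7), the Gaussian case of (3.11) and the scaling factor of Theorem 3.1, using the closed forms (6.4)-(6.6) for $m_1$, $m_2$, $m_3$ at $\lambda_k = \phi(\alpha_k)$; then through the final closed form $2\alpha^2[(\alpha-1)^2 - y]/(\alpha-1)^2$. The function names are illustrative assumptions, and $y = 0.5$ with spikes $(4, 3, 0.2, 0.1)$ is borrowed from the next example.

```python
def m1_at_phi(alpha):
    """(6.4): m_1(phi(alpha)) = 1 / (alpha - 1)."""
    return 1.0 / (alpha - 1.0)

def m2_at_phi(alpha, y):
    """(6.5): m_2(phi(alpha))."""
    a = alpha - 1.0
    return (a + y * (alpha + 1.0)) / (a * (a * a - y))

def m3_at_phi(alpha, y):
    """(6.6): m_3(phi(alpha)) = 1 / ((alpha - 1)^2 - y)."""
    return 1.0 / ((alpha - 1.0) ** 2 - y)

def variance_via_theorem(alpha, y):
    """Variance of the Gaussian limit assembled step by step (real Gaussian, simple spike)."""
    theta = 1.0 + 2.0 * y * m1_at_phi(alpha) + y * m2_at_phi(alpha, y)  # (3.7)
    var_r = 2.0 * theta * alpha ** 2            # (3.11) with beta = 0 and Sigma_kk = alpha
    scale = 1.0 / (1.0 + y * m3_at_phi(alpha, y) * alpha)  # factor in Theorem 3.1
    return scale ** 2 * var_r

def variance_closed_form(alpha, y):
    """Final expression 2 alpha^2 [(alpha - 1)^2 - y] / (alpha - 1)^2."""
    return 2.0 * alpha ** 2 * ((alpha - 1.0) ** 2 - y) / (alpha - 1.0) ** 2

if __name__ == "__main__":
    y = 0.5
    for alpha in (4.0, 3.0, 0.2, 0.1):
        print(alpha, variance_via_theorem(alpha, y), variance_closed_form(alpha, y))
```

Both routes agree, which is a useful sanity check on the algebra leading from (3.11) and (6.6) to the stated variance.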
4.2. More general Gaussian variables case

In this example, we assume that all variables are real Gaussian, and the coordinates of $\xi$ are independent.
As in [6], we fix $y = 0.5$. The critical interval is then $[1 - \sqrt{y}, 1 + \sqrt{y}] = [0.293, 1.707]$ and the limiting support
$[a_y, b_y] = [0.086, 2.914]$.

Consider $K = 4$ spike eigenvalues $(\alpha_1, \alpha_2, \alpha_3, \alpha_4) = (4, 3, 0.2, 0.1)$ with respective multiplicities
$(n_1, n_2, n_3, n_4) = (1, 2, 2, 1)$. Let

$$\lambda_{n,1} \ge \lambda_{n,2} \ge \lambda_{n,3} \quad \text{and} \quad \lambda_{n,4} \ge \lambda_{n,5} \ge \lambda_{n,6}$$

be, respectively, the three largest and the three smallest eigenvalues of the sample covariance matrix. Let,
as in Section 4.1,

$$\sigma_{\alpha_k}^2 = \frac{2\alpha_k^2[(\alpha_k - 1)^2 - y]}{(\alpha_k - 1)^2}. \tag{4.1}$$

We have $(\sigma_{\alpha_k}^2, k = 1, \ldots, 4) = (30.222, 15.75, 0.0175, 0.00765)$.

Following Theorem 3.1, taking into account Section 4.1 and Proposition 3.1, we have

• for $j = 1$ and $6$,

$$\delta_{n,j} = \sqrt{n}[\lambda_{n,j} - \phi(\alpha_k)] \Rightarrow N(0, \sigma_{\alpha_k}^2). \tag{4.2}$$

Here, for $j = 1$, $k = 1$, $\phi(\alpha_1) = 4.667$ and $\sigma_{\alpha_1}^2 = 30.222$; and for $j = 6$, $k = 4$, $\phi(\alpha_4) = 0.044$ and
$\sigma_{\alpha_4}^2 = 0.00765$;

• for $j = (2, 3)$ or $j = (4, 5)$, the two-dimensional vector $\delta_{n,j} = \sqrt{n}[\lambda_{n,j} - \phi(\alpha_k)]$ converges weakly to the
distribution of (ordered) eigenvalues of the random matrix

$$G = \sigma_{\alpha_k} \begin{pmatrix} W_{11} & W_{12} \\ W_{12} & W_{22} \end{pmatrix}.$$

Here, because the initial variables $(\xi(i))$'s are Gaussian, by Eqs (3.11) and (3.12), we have $\operatorname{var}(W_{11}) =
\operatorname{var}(W_{22}) = 1$, $\operatorname{var}(W_{12}) = \frac{1}{2}$, so that $(W_{ij})$ is a real Gaussian-Wigner matrix (with independent entries).
Again, the variance parameter $\sigma_{\alpha_k}^2$ is defined as previously but with $k = 2$ for $j = (2, 3)$ and $k = 3$ for
$j = (4, 5)$, respectively. Since the joint distribution of eigenvalues of a Gaussian-Wigner matrix is known
(see [11]), we get the following (unordered) density for the limiting distribution of $\delta_{n,j}$:

$$g(\delta, \gamma) = \frac{1}{4\sigma_{\alpha_k}^3 \sqrt{\pi}}\, |\delta - \gamma| \exp\Big[-\frac{\delta^2 + \gamma^2}{2\sigma_{\alpha_k}^2}\Big]. \tag{4.3}$$

Experiments are conducted to compare numerically the empirical distribution of the $\delta_{n,j}$'s to their limiting
values. To this end, we fix $p = 500$ and $n = 1000$. We repeat 1000 independent simulations to get 1000
replications of the six random variates $\{\delta_{n,j}, j = 1, \ldots, 6\}$. Based on these replications, we compute

• a kernel density estimate for the two univariate variables $\delta_{n,1}$ and $\delta_{n,6}$, denoted by $f_{n,1}$ and $f_{n,6}$, respectively;

• a kernel density estimate for the two bivariate variables $(\delta_{n,2}, \delta_{n,3})$ and $(\delta_{n,4}, \delta_{n,5})$, denoted by $f_{n,23}$ and $f_{n,45}$,
respectively.

The kernel density estimates are computed using the R software implementing an automatic bandwidth
selection method from [13].

Figure 1 compares the two univariate density estimates $f_{n,1}$ and $f_{n,6}$ to their Gaussian limits (4.2). As we
can see, the simulations confirm the formula well.

To compare the bivariate density estimates $f_{n,23}$ and $f_{n,45}$ to their limiting densities given in (4.3), we
choose to display their contour lines. This is done in Fig. 2 for $f_{n,23}$ and Fig. 3 for $f_{n,45}$. Again we see that
the theoretical result is well confirmed.
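
The limiting bivariate density (4.3) can be checked without any simulation. The sketch below assumes the density reads $g(\delta, \gamma) = |\delta - \gamma| \exp[-(\delta^2 + \gamma^2)/(2\sigma^2)] / (4\sigma^3\sqrt{\pi})$, the normalization implied by $\operatorname{var}(W_{11}) = \operatorname{var}(W_{22}) = 1$ and $\operatorname{var}(W_{12}) = 1/2$, and verifies on a grid that it integrates to one and that the mean squared gap $E[(\delta - \gamma)^2]$ equals $4\sigma^2$, as it must for the eigenvalues of $\sigma W$. The function names, the grid size and the truncation at $8\sigma$ are illustrative assumptions.

```python
import math

def wigner2_density(d, g, sigma):
    """Unordered eigenvalue density of sigma * W for a 2x2 real Gaussian-Wigner W."""
    return (abs(d - g) * math.exp(-(d * d + g * g) / (2.0 * sigma ** 2))
            / (4.0 * sigma ** 3 * math.sqrt(math.pi)))

def integrate_on_grid(func, sigma, half_width=8.0, cells=300):
    """Midpoint rule for func(d, g) over the square [-half_width*sigma, half_width*sigma]^2."""
    bound = half_width * sigma
    h = 2.0 * bound / cells
    points = [-bound + (i + 0.5) * h for i in range(cells)]
    return h * h * sum(func(d, g) for d in points for g in points)

if __name__ == "__main__":
    sigma = math.sqrt(15.75)  # sigma_{alpha_2} for alpha_2 = 3 and y = 0.5
    mass = integrate_on_grid(lambda d, g: wigner2_density(d, g, sigma), sigma)
    gap2 = integrate_on_grid(
        lambda d, g: (d - g) ** 2 * wigner2_density(d, g, sigma), sigma)
    print("total mass        :", mass)
    print("E[(delta-gamma)^2]:", gap2, "expected", 4.0 * sigma ** 2)
```

The factor $|\delta - \gamma|$ is the eigenvalue repulsion term; it makes the two coordinates dependent and each marginal non-Gaussian, in line with the remark following Theorem 3.1.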



Eigenvalues in a spiked population model 



Fig. 1. Empirical density estimates (in solid lines) from the largest (top: $f_{n,1}$) and the smallest (bottom: $f_{n,6}$) sample
eigenvalue from 1000 independent replications, compared to their Gaussian limits (dashed lines). Gaussian entries with $p = 500$
and $n = 1000$.



4.3. A binary variables case

As in the previous example, we fix $y = 0.5$ and adopt the same spike eigenvalues $(\alpha_1, \alpha_2, \alpha_3, \alpha_4) =
(4, 3, 0.2, 0.1)$ with multiplicities $(n_1, n_2, n_3, n_4) = (1, 2, 2, 1)$. Let the $\sigma_{\alpha_k}^2$'s be as defined in (4.1). Again
we assume that all the coordinates are independent but this time we consider binary entries. To match
the prescribed eigenvalues, we set

$$\xi(i) = \sqrt{\alpha_i}\,\varepsilon_i, \qquad \eta(j) = \varepsilon_j',$$

where $(\varepsilon_i)$ and $(\varepsilon_j')$ are two independent sequences of i.i.d. binary variables taking values $\{+1, -1\}$ with
equal probability. We remark that $E\varepsilon_i = 0$, $E\varepsilon_i^2 = 1$ and $\beta_i = E[\xi^4(i)]/[E\xi^2(i)]^2 - 3 = -2$. This last value
denotes a departure from the Gaussian case.

As in the previous example, we examine the limiting distributions of the three largest and the three
smallest eigenvalues $\{\lambda_{n,j}, j = 1, \ldots, 6\}$ of the sample covariance matrix. Following Theorem 3.1, we have

• for $j = 1$ and $6$,

$$\delta_{n,j} = \sqrt{n}[\lambda_{n,j} - \phi(\alpha_k)] \Rightarrow N(0, s_{\alpha_k}^2), \qquad s_{\alpha_k}^2 = \sigma_{\alpha_k}^2\, \frac{y}{(\alpha_k - 1)^2}.$$

Compared to the previous Gaussian case, as the factor $y/(\alpha_k - 1)^2 < 1$, the limiting Gaussian distributions
of the largest and the smallest eigenvalue are less dispersed;

• for $j = (2, 3)$ or $j = (4, 5)$, the two-dimensional vector $\delta_{n,j} = \sqrt{n}[\lambda_{n,j} - \phi(\alpha_k)]$ converges weakly to the
distribution of (ordered) eigenvalues of the random matrix

$$G = \sigma_{\alpha_k} \begin{pmatrix} W_{11} & W_{12} \\ W_{12} & W_{22} \end{pmatrix}. \tag{4.4}$$

Here, because the initial variables $(\xi(i))$'s are binary, hence $\beta_i = -2$ (which is zero for Gaussian variables),
by Eqs (3.11) and (3.12), we have $\operatorname{var}(W_{12}) = \frac{1}{2}$ but $\operatorname{var}(W_{11}) = \operatorname{var}(W_{22}) = y/(\alpha_k - 1)^2$. Therefore, the
matrix $W = (W_{ij})$ is no longer a real Gaussian-Wigner matrix. Again, the variance parameter $\sigma_{\alpha_k}^2$ is
defined as previously but with $k = 2$ for $j = (2, 3)$ and $k = 3$ for $j = (4, 5)$, respectively. Unfortunately and
unlike the previous Gaussian case, the joint distribution of eigenvalues of $W$ is unknown analytically. We
then compute this joint density empirically by simulation, using 10000 independent replications. Again, as
$y/(\alpha_k - 1)^2 < 1$, these limiting distributions are less dispersed than previously.

Z. Bai and J.-F. Yao

Fig. 3. Limiting bivariate distribution from the second and the third smallest sample eigenvalues. Top: Contour lines of the
empirical kernel density estimates $f_{n,45}$ from 1000 independent replications with $p = 500$, $n = 1000$ and Gaussian entries.
Bottom: Contour lines of their limiting distribution given by the eigenvalues of a $2 \times 2$ Gaussian-Wigner matrix.

The kernel density estimates $f_{n,1}$, $f_{n,6}$, $f_{n,23}$ and $f_{n,45}$ are computed as in the previous case using $p = 500$,
$n = 1000$ and 1000 independent replications.

Figure 4 compares the two univariate density estimates $f_{n,1}$ and $f_{n,6}$ to their Gaussian limits. Again, we
see that the simulations confirm the formula well. However, we observe a slower convergence rate than in
the Gaussian case.

The bivariate density estimates $f_{n,23}$ and $f_{n,45}$ are then compared to their limiting densities in Figs 5
and 6, respectively. Again we see that the theoretical result is well confirmed. We remark that the shape
of these bivariate limiting distributions is rather different from the previous Gaussian case. We remind the
reader that the limiting bivariate densities are obtained by simulations of 10000 independent $G$ matrices
given in (4.4).
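
The simulation of the limiting matrices $G$ can be sketched as follows. The code draws symmetric $2 \times 2$ Gaussian matrices with $\operatorname{var}(W_{11}) = \operatorname{var}(W_{22}) = y/(\alpha_k - 1)^2$ and $\operatorname{var}(W_{12}) = 1/2$, scales them by $\sigma_{\alpha_k}$ from (4.1), and returns the ordered eigenvalue pairs. It is only a minimal stand-in for the computation described above: the function name, the seed and the use of Python's `random` module are assumptions, and the kernel density step is omitted.

```python
import math
import random

def sample_limit_eigenvalues(alpha, y, n_rep=10000, seed=0):
    """Ordered eigenvalues (largest first) of n_rep draws of G = sigma * W, binary case."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * alpha ** 2 * ((alpha - 1.0) ** 2 - y) / (alpha - 1.0) ** 2)
    sd_diag = math.sqrt(y) / abs(alpha - 1.0)  # sqrt of var(W11) = var(W22)
    sd_off = math.sqrt(0.5)                    # sqrt of var(W12)
    pairs = []
    for _ in range(n_rep):
        w11 = rng.gauss(0.0, sd_diag)
        w22 = rng.gauss(0.0, sd_diag)
        w12 = rng.gauss(0.0, sd_off)
        # Eigenvalues of [[w11, w12], [w12, w22]] are center +/- radius.
        center = 0.5 * (w11 + w22)
        radius = math.hypot(0.5 * (w11 - w22), w12)
        pairs.append((sigma * (center + radius), sigma * (center - radius)))
    return pairs

if __name__ == "__main__":
    pairs = sample_limit_eigenvalues(alpha=3.0, y=0.5)
    mean_top = sum(p[0] for p in pairs) / len(pairs)
    mean_bottom = sum(p[1] for p in pairs) / len(pairs)
    print("mean of larger eigenvalue :", mean_top)
    print("mean of smaller eigenvalue:", mean_bottom)
```

Since the diagonal variance $y/(\alpha_k - 1)^2$ is smaller than 1, the sampled pairs are less spread along the direction $\delta + \gamma$ than in the Gaussian-Wigner case, which is the change of shape noted above.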



5. Some extensions 



It is possible to extend the spiked population model introduced in Section 2 to a much greater generality. 
Let us consider a population p x p covariance matrix 



$$V = \operatorname{cov}(x) = \begin{pmatrix} \Sigma & 0 \\ 0 & T_p \end{pmatrix},$$

where $\Sigma$ is as previously while $T_p$ is now an arbitrary Hermitian matrix. As $M$ will be fixed and $p \to \infty$,
the limit $F$ of the ESD of the sample covariance matrix depends on the sequence of $(T_p)$ only. With some
ambiguity, we again call the eigenvalues $\alpha_k$'s of $\Sigma$ spike eigenvalues in the sense that they do not contribute
to this limit.

Fig. 5. Limiting bivariate distribution from the second and the third sample eigenvalues. Top: Contour lines of the empirical
kernel density estimates $f_{n,23}$ from 1000 independent replications with $p = 500$, $n = 1000$ and binary entries. Bottom: Contour
lines of their limiting distribution given by the eigenvalues of a $2 \times 2$ random matrix (computed by simulations).

Fig. 6. Limiting bivariate distribution from the second and the third smallest sample eigenvalues. Top: Contour lines of the
empirical kernel density estimates $f_{n,45}$ from 1000 independent replications with $p = 500$, $n = 1000$ and binary entries. Bottom:
Contour lines of their limiting distribution given by the eigenvalues of a $2 \times 2$ random matrix (computed by simulations).

In the following, we assume for simplicity that $\Sigma$ as well as $T_p$ are diagonal, and when $p \to \infty$, the
empirical distribution of the eigenvalues of $T_p$ converges weakly to a probability measure $H(dt)$ on the real
line. Therefore, the limit $F$ of the ESD is characterized by an explicit formula for its Stieltjes transform,
see [5].

The previous model of Section 2 corresponds to the situation where $H(dt)$ is the Dirac measure at the
point 1. A more involved example, which will be analyzed later by numerical computations, is the following.
The core spectrum of $V$ is made of two eigenvalues $\omega_1 > \omega_2 > 0$, each repeated nearly $p/2$ times, and $V$ has
a fixed number $M$ of spiked eigenvalues distinct from the $\omega_j$'s. In this case, the limiting distribution $H$ is
$\frac{1}{2}(\delta_{\{\omega_1\}}(dt) + \delta_{\{\omega_2\}}(dt))$, a mixture of two Dirac masses.

The sample eigenvalues $\{\lambda_{n,j}\}$ are defined as previously. Assume that a spiked eigenvalue $\alpha_k$ is "sufficiently
separated" from the core spectrum of $V$, so that for some function $\psi$ to be determined, there is a point
$\psi(\alpha_k)$ outside the support of $F$ to which the packed sample eigenvalues $\{\lambda_{n,j}, j \in J_k\}$ converge almost surely.
In such a case, the analysis we have proposed is also valid, yielding a CLT analogous to Theorem 3.1: the
$n_k$-dimensional real vector

$$\sqrt{n}\{\lambda_{n,j} - \psi(\alpha_k),\ j \in J_k\}$$

converges weakly to the distribution of the $n_k$ eigenvalues of some Gaussian random matrix. In particular,
if $n_k = 1$, this limiting distribution is Gaussian.

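We do not intend to provide here all details in this extended situation. However, let us indicate how we
can determine the almost sure limit $\psi(\alpha_k)$ of the packed eigenvalues. From the almost sure convergence
and since $\psi(\alpha_k)$ is outside the support of $F$, with probability tending to one, the $\lambda_{n,j}$ solve the determinant
equation (3.3). With $A_n = X_2^*(\lambda I - X_2 X_2^*)^{-1} X_2$, we have

$$K_n(\lambda) = S_{11} + X_1 A_n X_1^* = \frac{1}{n} \xi_{1:n} (I + A_n) \xi_{1:n}^*,$$

which tends almost surely to $[1 + y m_1(\lambda)]\Sigma$. Therefore, any limit $\lambda$ of a $\lambda_{n,j}$ fulfills the relation

$$\lambda - [1 + y m_1(\lambda)]\alpha = 0, \tag{5.1}$$

for some eigenvalue $\alpha$ of $\Sigma$.

Let $m(\lambda)$ be the Stieltjes transform of the limiting distribution $F$ and $\underline{m}(\lambda)$ the one of $yF(dt) + (1 -
y)\delta_{\{0\}}(dt)$. Clearly, $\lambda \underline{m}(\lambda) = -1 + y + y\lambda m(\lambda)$. Moreover, it is known that, see e.g. [5],

$$\lambda = -\frac{1}{\underline{m}(\lambda)} + y \int \frac{t}{1 + t\,\underline{m}(\lambda)}\, dH(t). \tag{5.2}$$

As $m_1(\lambda) = -1 - \lambda m(\lambda)$ by definition, Eq. (5.1) reads as

$$\lambda = [1 - y - y\lambda m(\lambda)]\alpha = -\lambda \underline{m}(\lambda)\alpha.$$

It follows then that $1 + \alpha \underline{m}(\lambda) = 0$ (generally, $\lambda \ne 0$). Combining with (5.2), we get finally

$$\lambda = \psi(\alpha) = \alpha + y\alpha \int \frac{t}{\alpha - t}\, dH(t). \tag{5.3}$$

In particular, for the original spiked population model with $H(dt) = \delta_{\{1\}}(dt)$, we recover the relation given
in (2.3).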
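
For a discrete limiting measure $H$, the limit function is a finite sum. The sketch below assumes that (5.3) reads $\psi(\alpha) = \alpha + y\alpha \int t/(\alpha - t)\, dH(t)$, which reduces to (2.3) when $H$ is the Dirac mass at 1, and evaluates it for $H = \sum_k w_k \delta_{t_k}$. The function name `psi` and its argument layout are illustrative assumptions.

```python
def psi(alpha, y, atoms, weights):
    """psi(alpha) = alpha + y * alpha * sum_k w_k * t_k / (alpha - t_k) for discrete H."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights of H must sum to one")
    return alpha + y * alpha * sum(w * t / (alpha - t) for t, w in zip(atoms, weights))

if __name__ == "__main__":
    # Original model: H is the Dirac mass at 1, so psi coincides with phi in (2.3).
    print(psi(4.0, 0.5, [1.0], [1.0]))
    # Two-atom core spectrum with atoms 1 and 10 in equal proportions, y = 0.2.
    for alpha in (5.0, 4.0, 3.0):
        print(alpha, "->", psi(alpha, 0.2, [1.0, 10.0], [0.5, 0.5]))
```

A spike equal to one of the atoms of $H$ makes the sum singular, so the function is only meaningful for spikes distinct from the core spectrum.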
Fig. 7. An example of $p = 500$ sample eigenvalues (top) and a zoomed view on $[0, 5]$ (bottom). The limiting distribution of
the ESD has support $[0.395, 1.579] \cup [4.784, 17.441]$. The four eigenvalues $\{\lambda_{n,j}, 1 \le j \le 4\}$ in the middle, related to spiked
eigenvalues, are marked with a point. Gaussian entries with $n = 2500$.



We conclude the section by giving some numerical results for the above-mentioned example of an extended
spiked population model. We consider $(\omega_1, \omega_2) = (1, 10)$, $(\alpha_1, \alpha_2, \alpha_3) = (5, 4, 3)$ with respective
multiplicities $(1, 2, 1)$, and the limit ratio $y = 0.2$. Note that these spiked eigenvalues now lie between
the dominating eigenvalues (1 and 10). On the other hand, the support of the limiting distribution
of the ESD can be determined following the method given in [5], and we get two disjoint intervals:
$\operatorname{supp} F = [0.395, 1.579] \cup [4.784, 17.441]$.

For simulation, we use $p = 500$, $n = 2500$ and the eigenvalues of the population covariance matrix $V$ are
1 (248), 3 (1), 4 (2), 5 (1) and 10 (248). We simulate 500 independent replications of the sample covariance
matrix with Gaussian variables. One of these 500 replications is displayed in Fig. 7.

For each replication, the four eigenvalues in the middle (of indexes 249, 250, 251, 252) are extracted. Let
us denote these 4 eigenvalues by $\lambda_{n,1}, \lambda_{n,2}, \lambda_{n,3}, \lambda_{n,4}$. By (5.3), we know that the almost sure limits of these
sample eigenvalues are respectively

$$\psi(\alpha_k) = \alpha_k \Big[1 + \frac{1}{10(\alpha_k - 1)} + \frac{1}{\alpha_k - 10}\Big] = (4.125, 3.467, 2.721).$$

Fig. 8 displays the empirical densities of

$$\delta_{n,j} = \sqrt{n}(\lambda_{n,j} - \psi(\alpha_k)), \qquad 1 \le j \le 4,$$

from the 500 independent replications. The graphs of $\delta_{n,1}$ and $\delta_{n,4}$ confirm a limiting zero-mean Gaussian
distribution corresponding to the simple spike eigenvalues 5 and 3. On the contrary, the limiting distributions
of $\delta_{n,2}$ and $\delta_{n,3}$, related to the double spike eigenvalue 4, are not zero-mean Gaussian. We note that $\delta_{n,2}$
and $-\delta_{n,3}$ have approximately the same distribution. Indeed, their joint distribution converges to that of
the eigenvalues of a $2 \times 2$ Gaussian-Wigner matrix.
Fig. 8. Empirical densities of the normalized sample eigenvalues $\{\delta_{n,j}, 1 \le j \le 4\}$ from 500 independent replications. Gaussian
entries with $p = 500$ and $n = 2500$.

6. Proofs of Propositions 3.1, 3.2 and Theorem 3.1 

Before giving the proofs, some preliminary results and useful lemmas are introduced. Note that these proofs 
are based on a CLT for random sesquilinear forms which is itself introduced and proved in Section 7. 

6.1. Preliminary results and useful lemmas 

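For $\lambda \notin [a_y, b_y]$, we define

$$m_1(\lambda) = \int \frac{x}{\lambda - x}\, F_y(dx), \tag{6.1}$$

$$m_2(\lambda) = \int \frac{x^2}{(\lambda - x)^2}\, F_y(dx), \tag{6.2}$$

$$m_3(\lambda) = \int \frac{x}{(\lambda - x)^2}\, F_y(dx). \tag{6.3}$$

It is easily seen that

$$\int \frac{\lambda}{\lambda - x}\, F_y(dx) = 1 + m_1(\lambda), \qquad \int \frac{\lambda^2}{(\lambda - x)^2}\, F_y(dx) = 1 + 2 m_1(\lambda) + m_2(\lambda).$$

If a real constant $\alpha \notin [1 - \sqrt{y}, 1 + \sqrt{y}]$, then $\phi(\alpha) \notin [a_y, b_y]$ and we have

$$m_1 \circ \phi(\alpha) = \frac{1}{\alpha - 1}, \tag{6.4}$$

$$m_2 \circ \phi(\alpha) = \frac{(\alpha - 1) + y(\alpha + 1)}{(\alpha - 1)[(\alpha - 1)^2 - y]}, \tag{6.5}$$

$$m_3 \circ \phi(\alpha) = \frac{1}{(\alpha - 1)^2 - y}. \tag{6.6}$$

Let us mention that all these formulas can be obtained by differentiation of the Stieltjes transform of the
Marčenko-Pastur law $F_y(dx)$,

$$m(z) = \int \frac{1}{x - z}\, F_y(dx) = \frac{1}{2yz}\Big\{1 - y - z + \sqrt{(y + 1 - z)^2 - 4y}\Big\}.$$

Here, $\sqrt{u}$ denotes the square root with positive imaginary part for $u \in \mathbb{C}$.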
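
The closed forms (6.4)-(6.6) can be compared with a direct numerical integration of the definitions (6.1)-(6.3). The sketch below does this with a midpoint rule against the Marčenko-Pastur density $\sqrt{(x - a_y)(b_y - x)}/(2\pi x y)$ on $[a_y, b_y]$. The function names, the step count and the test values $y = 0.5$, $\alpha \in \{4, 0.1\}$ are illustrative assumptions.

```python
import math

def mp_transforms(lam, y, steps=40000):
    """Midpoint-rule values of (m_1, m_2, m_3) at lam, for lam outside [a_y, b_y]."""
    a = (1.0 - math.sqrt(y)) ** 2
    b = (1.0 + math.sqrt(y)) ** 2
    h = (b - a) / steps
    m1 = m2 = m3 = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        dens = math.sqrt((x - a) * (b - x)) / (2.0 * math.pi * x * y)
        ratio = x / (lam - x)
        m1 += ratio * dens                # integrand of (6.1)
        m2 += ratio * ratio * dens        # integrand of (6.2)
        m3 += ratio / (lam - x) * dens    # integrand of (6.3)
    return m1 * h, m2 * h, m3 * h

def closed_forms(alpha, y):
    """Right-hand sides of (6.4), (6.5) and (6.6)."""
    d = alpha - 1.0
    return (1.0 / d,
            (d + y * (alpha + 1.0)) / (d * (d * d - y)),
            1.0 / (d * d - y))

if __name__ == "__main__":
    y = 0.5
    for alpha in (4.0, 0.1):
        lam = alpha + y * alpha / (alpha - 1.0)  # phi(alpha) from (2.3)
        print(alpha, mp_transforms(lam, y), closed_forms(alpha, y))
```

The agreement for a spike above and a spike below the critical interval covers both sides of the support, where $\lambda - x$ has opposite signs.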

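The following lemma gives the law of large numbers for some useful statistics related to the random matrix $A_n$ introduced in Eq. (3.4).

Lemma 6.1. We have

$$\frac{1}{n} \operatorname{tr} A_n \xrightarrow{P} y\,m_1(\lambda), \tag{6.7}$$

$$\frac{1}{n} \operatorname{tr} A_n A_n^* \xrightarrow{P} y\,m_2(\lambda), \tag{6.8}$$

$$\frac{1}{n} \sum_{i=1}^{n} a_{ii}^2 \xrightarrow{P} \left(\frac{y[1 + m_1(\lambda)]}{\lambda - y[1 + m_1(\lambda)]}\right)^2. \tag{6.9}$$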

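Proof. Let $\beta_{n,j}$, $j = 1, \ldots, p$, be the eigenvalues of $S_{22} = X_2 X_2^*$. The first equality is easy. For the second one, we have

$$\frac{1}{n} \operatorname{tr} A_n A_n^* = \frac{1}{n} \operatorname{tr} (\lambda I - X_2 X_2^*)^{-1} X_2 X_2^* (\lambda I - X_2 X_2^*)^{-1} X_2 X_2^* = \frac{1}{n} \sum_{j=1}^{p} \frac{\beta_{n,j}^2}{(\lambda - \beta_{n,j})^2} \xrightarrow{P} y \int \frac{x^2}{(\lambda - x)^2}\, F_y(dx).$$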

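For (6.9), let $e_i \in \mathbb{C}^n$ be the column vector whose $i$th element is 1 and others are 0, and let $X_{2i}$ denote the matrix obtained from $X_2$ by deleting the $i$th column of $X_2$. We have $X_2 X_2^* = X_{2i} X_{2i}^* + \frac{1}{n}\eta_i \eta_i^*$. Therefore,

$$a_{ii} = e_i^* X_2^* (\lambda I - X_2 X_2^*)^{-1} X_2 e_i = \frac{1}{n}\eta_i^* (\lambda I - X_2 X_2^*)^{-1} \eta_i = -\frac{\frac{1}{n}\eta_i^* (X_{2i} X_{2i}^* - \lambda I)^{-1} \eta_i}{1 + \frac{1}{n}\eta_i^* (X_{2i} X_{2i}^* - \lambda I)^{-1} \eta_i}.$$

Using Lemma 2.7 of [5],

$$\mathrm{E}\left| \frac{1}{n}\eta_i^* (X_{2i} X_{2i}^* - \lambda I)^{-1} \eta_i - \frac{1}{n}\operatorname{tr}(X_{2i} X_{2i}^* - \lambda I)^{-1} \right|^2 \le \frac{K}{n^2}\, \mathrm{E}|\eta_1(1)|^4\, \mathrm{E} \operatorname{tr}(X_{2i} X_{2i}^* - \lambda I)^{-2},$$

which gives that

$$a_{ii} \xrightarrow{P} -\frac{y \int \frac{1}{x - \lambda}\, F_y(dx)}{1 + y \int \frac{1}{x - \lambda}\, F_y(dx)} = \frac{y[1 + m_1(\lambda)]}{\lambda - y[1 + m_1(\lambda)]}. \tag{6.10}$$

Further, it is easy to verify that

$$\limsup_{n} \mathrm{E}\, \frac{\operatorname{tr}\big(X_2^* (\lambda I - X_2 X_2^*)^{-1} X_2\big)^4}{n} < \infty,$$

which implies, together with inequality 3.3.41 of [8], that

$$\sup_n \mathrm{E}\, a_{11}^4 = \sup_n \frac{1}{n} \sum_{i=1}^{n} \mathrm{E}\, a_{ii}^4 \le \sup_n \mathrm{E}\, \frac{\operatorname{tr}\big(X_2^* (\lambda I - X_2 X_2^*)^{-1} X_2\big)^4}{n} < \infty.$$

Therefore, the family of the random variables $\{a_{11}^2\}$ indexed by $n$ is uniformly integrable. Combining with (6.10), we get

$$\mathrm{E}\left| \frac{1}{n} \sum_{i=1}^{n} a_{ii}^2 - \left(\frac{y[1 + m_1(\lambda)]}{\lambda - y[1 + m_1(\lambda)]}\right)^2 \right| \le \mathrm{E}\left| a_{11}^2 - \left(\frac{y[1 + m_1(\lambda)]}{\lambda - y[1 + m_1(\lambda)]}\right)^2 \right| \to 0.$$

Thus (6.9) follows. □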
6.2. Proof of Proposition 3.1 

We apply Theorem 7.1 by considering $K = \frac{1}{2}M(M + 1)$ bilinear forms

$$u(i)(I + A_n)u(j)^T, \quad 1 \le i \le j \le M,$$

with

$$u(i) = (\xi_1(i), \ldots, \xi_n(i)).$$

More precisely, with $\ell = (i, j)$, we are substituting $u(i)^T$ for $X(\ell)$ and $u(j)^T$ for $Y(\ell)$, respectively. Consequently, $x_{\ell 1} = \xi_1(i)$ and $y_{\ell 1} = \xi_1(j)$ for the application of Theorem 7.1.
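We have, by Lemma 6.1,

$$\theta = \tau = \lim \frac{1}{n} \operatorname{tr}(I + A_n)^2 = 1 + 2y\,m_1(\lambda) + y\,m_2(\lambda),$$

$$\omega = \lim \frac{1}{n} \sum_{i=1}^{n} [(I + A_n)_{ii}]^2 = 1 + 2y\,m_1(\lambda) + \left(\frac{y[1 + m_1(\lambda)]}{\lambda - y[1 + m_1(\lambda)]}\right)^2.$$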

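Following Theorem 7.1, $R_n$ converges weakly to a symmetric random matrix $R = (R_{ij})$ with zero-mean Gaussian entries and the following covariance function: for $1 \le i \le j \le M$ and $1 \le i' \le j' \le M$,

$$\operatorname{cov}(R_{ij}, R_{i'j'}) = \omega\{\mathrm{E}[\xi(i)\xi(j)\xi(i')\xi(j')] - \Sigma_{ij}\Sigma_{i'j'}\} + (\theta - \omega)\{\mathrm{E}[\xi(i)\xi(j')]\,\mathrm{E}[\xi(i')\xi(j)]\} + (\theta - \omega)\{\mathrm{E}[\xi(i)\xi(i')]\,\mathrm{E}[\xi(j)\xi(j')]\}. \tag{6.11}$$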

6.3. Proof of Proposition 3.2

The aim is to apply Theorem 7.3 to $K = \frac{1}{2}M(M + 1)$ sesquilinear forms

$$u(i)(I + A_n)u(j)^*, \quad 1 \le i \le j \le M,$$

with

$$u(i) = (\xi_1(i), \ldots, \xi_n(i)).$$

More precisely, with $\ell = (i, j)$, we are substituting $u(i)^*$ for $X(\ell)$ and $u(j)^*$ for $Y(\ell)$, respectively. Consequently, $x_{\ell 1} = \xi_1(i)$ and $y_{\ell 1} = \xi_1(j)$ for the application of Theorem 7.3.
Again by Lemma 6.1,

$$\theta = \lim \frac{1}{n} \operatorname{tr}(I + A_n)^2 = 1 + 2y\,m_1(\lambda) + y\,m_2(\lambda),$$

$$\omega = \lim \frac{1}{n} \sum_{i=1}^{n} [(I + A_n)_{ii}]^2 = 1 + 2y\,m_1(\lambda) + \left(\frac{y[1 + m_1(\lambda)]}{\lambda - y[1 + m_1(\lambda)]}\right)^2.$$

Here we need an additional condition which is specific to the complex case. Assume therefore,
r = lim-tr(/ + A n ){I + A n ) T = 1 + 2ymAX) + mAX). 

n n 

Consequently by Theorem 7.3, $R_n$ converges weakly to a zero-mean Hermitian random matrix $R = (R_{ij})$. Moreover, the joint distribution of the real and imaginary parts of the upper-triangular block $\{R_{ij},\ 1 \le i \le j \le M\}$ is a $2K$-dimensional Gaussian vector with covariance matrix



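$$\Gamma = \begin{pmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{12} & \Gamma_{22} \end{pmatrix}, \tag{6.12}$$

where

$$\Gamma_{11} = \frac{1}{4} \sum_{j=1}^{3} \{2\Re(B_j) + B_{ja} + B_{jb}\}, \qquad \Gamma_{22} = \frac{1}{4} \sum_{j=1}^{3} \{-2\Re(B_j) + B_{ja} + B_{jb}\}, \qquad \Gamma_{12} = \frac{1}{2} \sum_{j=1}^{3} \Im(B_j),$$

for $1 \le i \le j \le M$ and $1 \le i' \le j' \le M$, with the $B$-matrices defined in the proposition.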
6.4. Proof of Theorem 3.1 

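Let $\alpha_k \notin [1 - \sqrt{y}, 1 + \sqrt{y}]$ be fixed. Following Section 3.1, we can assume that the $n_k$ packed sample eigenvalues $\{\lambda_{n,j},\ j \in J_k\}$ are solutions of the equation $|\lambda - K_n(\lambda)| = 0$. As $\lambda_{n,j} \to \lambda_k$ almost surely, we define

$$\delta_{n,j} = \sqrt{n}\,(\lambda_{n,j} - \lambda_k).$$

We have

$$\lambda_{n,j} I - K_n(\lambda_{n,j}) = \lambda_k I + \frac{1}{\sqrt{n}}\delta_{n,j} I - K_n(\lambda_k) - [K_n(\lambda_{n,j}) - K_n(\lambda_k)].$$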

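Furthermore, using $A^{-1} - B^{-1} = A^{-1}(B - A)B^{-1}$, we have

$$\begin{aligned}
K_n(\lambda_{n,j}) - K_n(\lambda_k) &= \frac{1}{n}\,\xi_{1:n}\,[A_n(\lambda_{n,j}) - A_n(\lambda_k)]\,\xi_{1:n}^* \\
&= \frac{1}{n}\,\xi_{1:n} X_2^* \Big\{ \Big[\Big(\lambda_k + \tfrac{1}{\sqrt{n}}\delta_{n,j}\Big) I - S_{22}\Big]^{-1} - (\lambda_k I - S_{22})^{-1} \Big\} X_2\, \xi_{1:n}^* \\
&= -\frac{1}{\sqrt{n}}\,\delta_{n,j}\, \frac{1}{n}\,\xi_{1:n} X_2^* \Big[\Big(\lambda_k + \tfrac{1}{\sqrt{n}}\delta_{n,j}\Big) I - S_{22}\Big]^{-1} (\lambda_k I - S_{22})^{-1} X_2\, \xi_{1:n}^* \\
&= -\frac{1}{\sqrt{n}}\,\delta_{n,j}\,[y\,m_3(\lambda_k)\Sigma + o_P(1)].
\end{aligned}$$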

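Combining these estimates with (3.5) and (3.6), we have

$$\lambda_{n,j} I - K_n(\lambda_{n,j}) = \lambda_k I - [1 + y\,m_1(\lambda_k)]\Sigma - \frac{1}{\sqrt{n}} R_n(\lambda_k) + \frac{1}{\sqrt{n}}\delta_{n,j}[I + y\,m_3(\lambda_k)\Sigma] + o_P\Big(\frac{1}{\sqrt{n}}\Big). \tag{6.13}$$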

By Section 3.1, $R_n(\lambda_k)$ converges in distribution to an $M \times M$ random matrix $R(\lambda_k)$ with Gaussian entries with a fully identified covariance matrix. We now follow a method devised in [1] and [2] for limiting distributions of eigenvalues or eigenvectors from random matrices. First, we use Skorokhod strong representation so that on an appropriate probability space, the convergence $R_n(\lambda_k) \to R(\lambda_k)$ as well as (6.13) take place almost surely. Multiplying both sides of (6.13) by $U$ from the left and by $U^*$ from the right yields

$$U[\lambda_{n,j} I - K_n(\lambda_{n,j})]U^* = \operatorname{diag}\big( (\lambda_k - [1 + y\,m_1(\lambda_k)]\alpha_u)\, I_{n_u} \big)_u + \frac{1}{\sqrt{n}}\,\delta_{n,j}\, \operatorname{diag}\big( (1 + y\,m_3(\lambda_k)\alpha_u)\, I_{n_u} \big)_u - \frac{1}{\sqrt{n}}\, U R_n(\lambda_k) U^* + o\Big(\frac{1}{\sqrt{n}}\Big).$$

First, in the right-hand side of the equation and using a block decomposition induced by (3.17), we see that all the non-diagonal blocks tend to zero. Next, for a diagonal block with index $u \ne k$, by definition $\lambda_k - [1 + y\,m_1(\lambda_k)]\alpha_u \ne 0$, and this is the limit of that diagonal block since the contributions from the remaining three terms tend to zero. As $\lambda_k - [1 + y\,m_1(\lambda_k)]\alpha_k = 0$ by definition, the $k$th diagonal block reduces to

$$-\frac{1}{\sqrt{n}}[U R_n(\lambda_k) U^*]_{kk} + \frac{1}{\sqrt{n}}\delta_{n,j}(1 + y\,m_3(\lambda_k)\alpha_k)\, I_{n_k} + o\Big(\frac{1}{\sqrt{n}}\Big).$$

For $n$ sufficiently large, its determinant must be equal to zero,

$$\Big| -\frac{1}{\sqrt{n}}[U R_n(\lambda_k) U^*]_{kk} + \frac{1}{\sqrt{n}}\delta_{n,j}(1 + y\,m_3(\lambda_k)\alpha_k)\, I_{n_k} + o\Big(\frac{1}{\sqrt{n}}\Big) \Big| = 0,$$

or equivalently,

$$\big| -[U R_n(\lambda_k) U^*]_{kk} + \delta_{n,j}(1 + y\,m_3(\lambda_k)\alpha_k)\, I_{n_k} + o(1) \big| = 0.$$

Therefore, $\delta_{n,j}$ tends to a solution of

$$\big| -[U R(\lambda_k) U^*]_{kk} + \lambda(1 + y\,m_3(\lambda_k)\alpha_k)\, I_{n_k} \big| = 0,$$

that is, an eigenvalue of the matrix $(1 + y\,m_3(\lambda_k)\alpha_k)^{-1} R_{kk}(\lambda_k)$. Finally, as the index $j$ is arbitrary, all the random variables $\{\sqrt{n}(\lambda_{n,j} - \lambda_k),\ j \in J_k\}$ converge almost surely to the set of eigenvalues of the above matrix. Of course, this convergence also holds in distribution on the new probability space, hence on the original one.
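Two quantities used in this proof have simple closed forms once (6.4) and (6.6) are substituted: the centring identity $\lambda_k = [1 + y\,m_1(\lambda_k)]\alpha_k$ and the scaling factor $(1 + y\,m_3(\lambda_k)\alpha_k)^{-1}$ applied to $R_{kk}(\lambda_k)$. The sketch below evaluates both. It assumes $\lambda_k = \varphi(\alpha_k)$ with $\varphi(\alpha) = \alpha + y\alpha/(\alpha - 1)$, as recalled earlier in the paper; the function names are illustrative only.

```python
def spike_limit(alpha, y):
    """Assumed almost sure limit phi(alpha) = alpha + y*alpha/(alpha - 1)."""
    return alpha + y * alpha / (alpha - 1.0)


def centring_residual(alpha, y):
    """lambda_k - [1 + y*m_1(lambda_k)]*alpha_k, using m_1 from (6.4)."""
    m1 = 1.0 / (alpha - 1.0)
    return spike_limit(alpha, y) - (1.0 + y * m1) * alpha


def clt_scaling(alpha, y):
    """(1 + y*m_3(lambda_k)*alpha_k)**(-1), using m_3 from (6.6)."""
    m3 = 1.0 / ((alpha - 1.0) ** 2 - y)
    return 1.0 / (1.0 + y * m3 * alpha)


if __name__ == "__main__":
    for alpha in (5.0, 4.0, 3.0):
        print(alpha, spike_limit(alpha, 0.2), centring_residual(alpha, 0.2), clt_scaling(alpha, 0.2))
```

The residual vanishes up to rounding for every admissible $\alpha$, which is the reason the $k$th diagonal block has no term of order one. The spike values and $y = 0.2$ in the usage loop are arbitrary illustrations.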



7. A CLT for random sesquilinear forms 

The aim of this section is to establish a CLT for random sesquilinear forms as one of the central tools used in the paper. These results are independent of the previous sections and are of interest in their own right.

Consider a sequence $\{(x_i, y_i)\}_{i \in \mathbb{N}}$ of i.i.d. complex-valued, zero-mean random vectors belonging to $\mathbb{C}^K \times \mathbb{C}^K$ with a finite moment of the fourth order. We write

$$x_i = (x_{\ell i}) = \begin{pmatrix} x_{1i} \\ \vdots \\ x_{Ki} \end{pmatrix}, \qquad X(\ell) = (x_{\ell 1}, \ldots, x_{\ell n})^T, \quad 1 \le \ell \le K, \tag{7.1}$$

with a similar definition for the vectors $\{Y(\ell)\}_{1 \le \ell \le K}$. Set $\rho(\ell) = \mathrm{E}[\bar{x}_{\ell 1} y_{\ell 1}]$.
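Theorem 7.1. Let $\{A_n = [a_{ij}(n)]\}_n$ be a sequence of $n \times n$ Hermitian matrices and the vectors $\{X(\ell), Y(\ell)\}_{1 \le \ell \le K}$ as defined in (7.1). Assume that the following limits exist:

$$\omega = \lim_{n \to \infty} \frac{1}{n} \sum_{u=1}^{n} a_{uu}^2(n),$$

$$\theta = \lim_{n \to \infty} \frac{1}{n} \operatorname{tr} A_n^2 = \lim_{n \to \infty} \frac{1}{n} \sum_{u,v=1}^{n} |a_{uv}(n)|^2,$$

$$\tau = \lim_{n \to \infty} \frac{1}{n} \operatorname{tr} A_n A_n^T = \lim_{n \to \infty} \frac{1}{n} \sum_{u,v=1}^{n} a_{uv}^2(n).$$

Then, the $K$-dimensional complex-valued random vectors

$$Z_n = (Z_{n,\ell}), \qquad Z_{n,\ell} = \frac{1}{\sqrt{n}}\,[X(\ell)^* A_n Y(\ell) - \rho(\ell) \operatorname{tr} A_n], \quad 1 \le \ell \le K, \tag{7.2}$$

converge weakly to a zero-mean complex-valued vector $W$ whose real and imaginary parts are Gaussian. Moreover, the Laplace transform of $W$ is given by

$$\mathrm{E}\, e^{c^T W} = \exp\Big[\frac{1}{2}\, c^T B c\Big], \quad c \in \mathbb{C}^K, \tag{7.3}$$

where the matrix $B = B_1 + B_2 + B_3$ with

$$\begin{aligned}
B_1 &= \omega\big(\mathrm{E}[\bar{x}_{\ell 1} y_{\ell 1} \bar{x}_{\ell' 1} y_{\ell' 1}] - \rho(\ell)\rho(\ell')\big), & 1 \le \ell, \ell' \le K, \\
B_2 &= (\theta - \omega)\big(\mathrm{E}[\bar{x}_{\ell 1} y_{\ell' 1}]\, \mathrm{E}[\bar{x}_{\ell' 1} y_{\ell 1}]\big), & 1 \le \ell, \ell' \le K, \\
B_3 &= (\tau - \omega)\big(\mathrm{E}[\bar{x}_{\ell 1} \bar{x}_{\ell' 1}]\, \mathrm{E}[y_{\ell 1} y_{\ell' 1}]\big), & 1 \le \ell, \ell' \le K.
\end{aligned} \tag{7.4}$$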

The proof of the theorem is postponed to the end of the section. First, we describe some specific applications of the theorem that are of interest in their own right. Note that by definition, the three matrices $B_j$ are symmetric (complex-valued).

Consider first the real variables case with i.i.d. random vectors $\{(x_i, y_i)\}_{i \in \mathbb{N}}$ from $\mathbb{R}^K \times \mathbb{R}^K$, and a sequence of symmetric matrices $\{A_n = [a_{ij}(n)]\}_n$. We are then considering $K$ random bilinear forms and consequently, $\theta = \tau$. The matrix $B$ given above is then exactly the limiting covariance matrix of the Gaussian vector $W$.

Corollary 7.1. Under the same conditions as in Theorem 7.1 but with real random vectors $\{(x_i, y_i)\}_{i \in \mathbb{N}}$ and symmetric matrices $\{A_n\}_n$, the sequence of vectors $(Z_n)_n$ converges weakly to a zero-mean $K$-dimensional Gaussian vector with covariance matrix $B$.

An interesting application to the case $(x_i) = (y_i)$ gives the following CLT for random quadratic forms in a straightforward way.

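Theorem 7.2. Let $\{A_n = [a_{ij}(n)]\}_n$ be a sequence of $n \times n$ real symmetric matrices, $(x_i)_{i \in \mathbb{N}}$ a sequence of i.i.d. $K$-dimensional real random vectors, with $\mathrm{E}[x_1] = 0$, $\mathrm{E}[x_1 x_1^T] = (\gamma_{ij})$, $1 \le i, j \le K$, and $\mathrm{E}[\|x_1\|^4] < \infty$. Let the vectors $\{X(\ell)\}_{1 \le \ell \le K}$ be as defined in (7.1). Assume the following limits exist:

$$\omega = \lim_{n \to \infty} \frac{1}{n} \sum_{u=1}^{n} a_{uu}^2(n), \qquad \theta = \lim_{n \to \infty} \frac{1}{n} \operatorname{tr} A_n^2.$$

Then, the $K$-dimensional random vectors

$$Z_n = (Z_{n,\ell}), \qquad Z_{n,\ell} = \frac{1}{\sqrt{n}}\,[X(\ell)^T A_n X(\ell) - \gamma_{\ell\ell} \operatorname{tr} A_n], \quad 1 \le \ell \le K, \tag{7.5}$$

converge weakly to a zero-mean Gaussian vector with covariance matrix $D = D_1 + D_2$ where

$$\begin{aligned}
D_1 &= \omega\big(\mathrm{E}[x_{\ell 1}^2 x_{\ell' 1}^2] - \gamma_{\ell\ell}\gamma_{\ell'\ell'}\big), & 1 \le \ell, \ell' \le K, \\
D_2 &= (\theta - \omega)\big(\gamma_{\ell\ell'}\gamma_{\ell'\ell} + \gamma_{\ell\ell'}^2\big), & 1 \le \ell, \ell' \le K.
\end{aligned} \tag{7.6}$$

If all the diagonal elements of the matrices $(A_n)$ are null, then $\omega = 0$. The limiting covariance matrix $D$ takes a much simpler form:

$$D = \theta\big(\gamma_{\ell\ell'}\gamma_{\ell'\ell} + \gamma_{\ell\ell'}^2\big), \quad 1 \le \ell, \ell' \le K.$$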
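The covariance formula of Theorem 7.2 can be illustrated in the scalar case $K = 1$, where $D = \omega(\mathrm{E}[x^4] - \gamma^2) + 2(\theta - \omega)\gamma^2$. The following sketch compares this value with a seeded Monte Carlo estimate. Everything in it is an assumption made for illustration: the entries are Rademacher ($\pm 1$, so $\gamma = 1$ and $\mathrm{E}[x^4] = 1$), the tridiagonal test matrix is hypothetical, and the finite-$n$ averages $\omega_n$, $\theta_n$ stand in for the limits $\omega$, $\theta$.

```python
import random


def limit_variance(omega, theta, fourth_moment, gamma):
    """D = D_1 + D_2 of Theorem 7.2 specialised to K = 1."""
    d1 = omega * (fourth_moment - gamma ** 2)
    d2 = (theta - omega) * 2.0 * gamma ** 2
    return d1 + d2


def example_matrix(n):
    """Hypothetical symmetric matrix: 1 on the diagonal, 1/2 next to it."""
    a = [[0.0] * n for _ in range(n)]
    for u in range(n):
        a[u][u] = 1.0
        if u + 1 < n:
            a[u][u + 1] = 0.5
            a[u + 1][u] = 0.5
    return a


def matrix_limits(a):
    """Finite-n versions of omega and theta: averages of a_uu^2 and of a_uv^2."""
    n = len(a)
    omega_n = sum(a[u][u] ** 2 for u in range(n)) / n
    theta_n = sum(a[u][v] ** 2 for u in range(n) for v in range(n)) / n
    return omega_n, theta_n


def simulate_variance(a, reps, rng):
    """Sample variance of Z_n = (x^T A x - tr A) / sqrt(n) for Rademacher x."""
    n = len(a)
    trace = sum(a[u][u] for u in range(n))
    values = []
    for _ in range(reps):
        x = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        q = sum(x[u] * a[u][v] * x[v] for u in range(n) for v in range(n))
        values.append((q - trace) / n ** 0.5)
    mean = sum(values) / reps
    return sum((z - mean) ** 2 for z in values) / (reps - 1)


if __name__ == "__main__":
    a = example_matrix(30)
    omega_n, theta_n = matrix_limits(a)
    print("formula  :", limit_variance(omega_n, theta_n, 1.0, 1.0))
    print("simulated:", simulate_variance(a, 2000, random.Random(0)))
```

With Rademacher entries the $D_1$ term vanishes, so the example isolates the contribution $2(\theta - \omega)\gamma^2$ of the off-diagonal entries.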

For the general complex case, we need a special device. Write $Z_n = U_n + iV_n$. Following Theorem 7.1, $(U_n, V_n)$ converges weakly to a $2K$-dimensional Gaussian vector with some covariance matrix $\Gamma$. The aim is to identify $\Gamma$. We have

$$\mathrm{E}\exp[t^T U_n + s^T V_n] \to \exp\Big[\frac{1}{2}\,(t^T, s^T)\,\Gamma \binom{t}{s}\Big], \quad t \in \mathbb{R}^K,\ s \in \mathbb{R}^K. \tag{7.7}$$

On the other hand, from $U_n = \frac{1}{2}(Z_n + \bar{Z}_n)$ and $V_n = \frac{1}{2i}(Z_n - \bar{Z}_n)$, we have a second expression

$$\mathrm{E}\exp[t^T U_n + s^T V_n] = \mathrm{E}\exp\Big[\Big(\frac{t}{2} + \frac{s}{2i}\Big)^T Z_n + \Big(\frac{t}{2} - \frac{s}{2i}\Big)^T \bar{Z}_n\Big].$$

Interestingly enough, the last transform can be found by application of Theorem 7.1 to the random sesquilinear forms

$$\mathbf{Z}_n = \binom{Z_n}{\bar{Z}_n}. \tag{7.8}$$

For ease of presentation, we need to define more limiting quantities. For $1 \le \ell \le K$, let $\sigma_{x\ell}^2 = \mathrm{E}[|x_{\ell 1}|^2]$ and $\sigma_{y\ell}^2 = \mathrm{E}[|y_{\ell 1}|^2]$. We introduce the following matrices

Bi a =uj(E[\xii\ 2 \xi> 1 \ 2 } -o- 2 xe ax e ,), 
B lb = uj(E[\y ei \ 2 \y i > 1 \ 2 } -Oy^cry^,), 
B 2a = (e-cj)(\E[x ei x tl }\ 2 ) 7 

B 2b = (6-uj)(\E[ynyi>i}\ 2 ), 
B 3a = {T-cj)(\E[x ei xe, 1 }\ 2 ) 7 
B 3b = (r-Lj)(\E[y ei y e , 1 }\ 2 ). 

Here, the indices are 1 <£,£'< K . By definition, all these matrices are real and symmetrical. Let us also 
define the 2K x 2K matrices 



(7.9) 



Bj = 



Bj Bj a 

B jb Bj 



, J = 1,2,3. 



(7.10) 
Theorem 7.3. Consider the $K$-dimensional complex-valued random vectors $Z_n = (Z_{n,\ell})$ defined in Theorem 7.1. Under the same conditions as in that theorem, the real and the imaginary parts $(U_n, V_n)$ of $Z_n$ converge weakly to a $2K$-dimensional Gaussian vector with covariance matrix

$$\Gamma = \begin{pmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{12} & \Gamma_{22} \end{pmatrix}, \tag{7.11}$$

with

$$\Gamma_{11} = \frac{1}{4} \sum_{j=1}^{3} \{2\Re(B_j) + B_{ja} + B_{jb}\},$$

$$\Gamma_{22} = \frac{1}{4} \sum_{j=1}^{3} \{-2\Re(B_j) + B_{ja} + B_{jb}\},$$

$$\Gamma_{12} = \frac{1}{2} \sum_{j=1}^{3} \Im(B_j).$$



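Proof. For the vector of sesquilinear forms $\mathbf{Z}_n$ in (7.8), one can check that the limiting matrix $B$ in Theorem 7.1 is to be replaced by

$$\mathbf{B} = \sum_{j=1}^{3} \mathbf{B}_j,$$

the sum of the $2K \times 2K$ matrices defined in (7.10). Then following this theorem, for $c = \big(\frac{t^T}{2} + \frac{s^T}{2i},\ \frac{t^T}{2} - \frac{s^T}{2i}\big)^T$,

$$\mathrm{E}\exp[t^T U_n + s^T V_n] = \mathrm{E}\exp[c^T \mathbf{Z}_n] \to \exp\Big[\frac{1}{2}\, c^T \mathbf{B}\, c\Big].$$

By identifying this formula with Eq. (7.7), we get the required form of $\Gamma$. □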
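The identification step rests on the elementary relations between the second moments of a complex variable and the covariance of its real and imaginary parts. For a scalar zero-mean $Z = U + iV$, one has $\mathrm{E}[Z^2] = \operatorname{Var} U - \operatorname{Var} V + 2i \operatorname{Cov}(U, V)$ and $\mathrm{E}[|Z|^2] = \operatorname{Var} U + \operatorname{Var} V$. The sketch below inverts these relations and checks them on a small discrete distribution. It is a scalar illustration only, with hypothetical function names and sample points, and it does not reproduce the matrices $B_{ja}$, $B_{jb}$ of the theorem.

```python
def second_moments(points):
    """E[Z^2] and E[|Z|^2] for Z uniform on a zero-mean list of complex points."""
    n = len(points)
    pseudo = sum(z * z for z in points) / n
    herm = sum(z.real ** 2 + z.imag ** 2 for z in points) / n
    return pseudo, herm


def real_imag_covariance(pseudo, herm):
    """Recover (Var U, Var V, Cov(U, V)) from E[Z^2] and E[|Z|^2]."""
    var_u = 0.5 * (herm + pseudo.real)
    var_v = 0.5 * (herm - pseudo.real)
    cov_uv = 0.5 * pseudo.imag
    return var_u, var_v, cov_uv


if __name__ == "__main__":
    # Hypothetical zero-mean sample points, chosen so that U and V are correlated.
    pts = [1 + 1j, -1 - 1j, 2 + 0.5j, -2 - 0.5j]
    print(real_imag_covariance(*second_moments(pts)))
```

The real part of $\mathrm{E}[Z^2]$ splits the total variance between $U$ and $V$, while its imaginary part carries their covariance, which is the scalar counterpart of the roles played by $\Re(B_j)$ and $\Im(B_j)$ in $\Gamma_{11}$, $\Gamma_{22}$ and $\Gamma_{12}$.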
7.1. Proof of Theorem 7.1 

It is sufficient to establish the CLT for the sequence of linear combinations of random Hermitian forms

$$\sum_{\ell=1}^{K} c_\ell\, X(\ell)^* A_n Y(\ell),$$

where the coefficients $(c_\ell) \in \mathbb{C}^K$ are arbitrary. Notice that $\mathrm{E}[X(\ell)^* A_n Y(\ell)] = \rho(\ell) \operatorname{tr}(A_n)$, where $\rho(\ell) = \mathrm{E}[\bar{x}_{\ell 1} y_{\ell 1}]$.

First, by a classical procedure of truncation and renormalization (see Section 7.2 for details), we can, without loss of generality, assume that there is a sequence $\varepsilon_n \downarrow 0$ such that

$$1 \le i \le n, \qquad \|x_i\| \vee \|y_i\| \le \varepsilon_n n^{1/4}. \tag{7.12}$$

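We will use the method of moments. Define, while dropping the index $n$ in the coefficients of $A_n$,

$$\xi_n = \frac{1}{\sqrt{n}} \sum_{\ell=1}^{K} c_\ell\,[X(\ell)^* A_n Y(\ell) - \rho(\ell) \operatorname{tr} A_n] = \frac{1}{\sqrt{n}} \sum_{e} a_e \psi_e,$$

where $e = (u, v) \in \{1, \ldots, n\}^2$ and

$$\psi_e = \begin{cases} \sum_{\ell=1}^{K} c_\ell\,(\bar{x}_{\ell u} y_{\ell u} - \rho(\ell)), & e = (u, u), \\ \sum_{\ell=1}^{K} c_\ell\, \bar{x}_{\ell u} y_{\ell v}, & e = (u, v),\ u \ne v. \end{cases}$$

Let $k \ge 1$ be a given integer. We have

$$n^{k/2} \xi_n^k = \sum_{e_1, \ldots, e_k} a_{e_1} \cdots a_{e_k}\, \psi_{e_1} \cdots \psi_{e_k}.$$

To each term in the sum we associate a directed graph $G$ by drawing an arrow $u \to v$ for each factor $e_j = (u, v)$. The set of vertices is then a subset of $\{1, \ldots, n\}$. Therefore, to a loop $u \to u$ corresponds the product $a_{uu}\psi_{uu} = a_{uu} \sum_{\ell=1}^{K} c_\ell\,[\bar{x}_{\ell u} y_{\ell u} - \rho(\ell)]$ and to an edge $u \to v$ with $u \ne v$ corresponds the product $a_{uv}\psi_{uv} = a_{uv} \sum_{\ell=1}^{K} c_\ell\, \bar{x}_{\ell u} y_{\ell v}$. In other words,

$$n^{k/2} \xi_n^k = \sum_{G} a_G \psi_G, \qquad a_G = \prod_{e \in G} a_e, \qquad \psi_G = \prod_{e \in G} \psi_e.$$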

We now consider the collection of connected sub-graphs of $G$. These connected sub-graphs can be classified into two types.

• Type-I sub-graphs. We call $C$ a Type-I connected sub-graph of $G$ if $C$ contains loops only. In particular $C$ has a unique vertex. The set of all the $m_1$ Type-I connected sub-graphs is denoted by $\mathcal{F}_1$, and the degrees of their vertices by $\mu_1, \ldots, \mu_{m_1}$, respectively.

If $\mu_j = 2$ for some vertex $j$ in a sub-graph $C$, then $\mathrm{E}\, a_G \psi_G = 0$ because of independence. Therefore we need only consider those graphs $G$ whose $m_1$ Type-I sub-graphs have all their vertices with degrees $\mu_j \ge 4$. The contributions from all these sub-graphs to the moment part $\psi_G$ are then bounded by

$$\Big| \mathrm{E} \prod_{C \in \mathcal{F}_1} \psi_C \Big| \le K (\varepsilon_n n^{1/4})^{\sum_{j=1}^{m_1} (\mu_j - 4)}. \tag{7.13}$$

• Type-II sub-graphs. A connected sub-graph containing at least one arrow $u \to v$ with $u \ne v$ is called a Type-II sub-graph. The set of all these components is denoted by $\mathcal{F}_2$. For each $C_s \in \mathcal{F}_2$, let $u_s$ be the number of its vertices, whose degrees are denoted by $\gamma_{js}$, $j = 1, \ldots, u_s$. As in Type-I, we can also omit the case where $\gamma_{js} = 1$ for some vertex $j$. Contributions from all the $m_2$ Type-II components to $\psi_G$ are then bounded by

$$\Big| \mathrm{E} \prod_{C_s \in \mathcal{F}_2} \psi_{C_s} \Big| \le K (\varepsilon_n n^{1/4})^{\sum_{s=1}^{m_2} \sum_{j=1}^{u_s} (\gamma_{js} - 2)}. \tag{7.14}$$

Combining (7.13) and (7.14) by noticing the relation $\sum_{j} \mu_j + \sum_{s} \sum_{j} \gamma_{js} = 2k$, the overall contribution from random variables has a bound

$$|\mathrm{E}\, \psi_G| \le K (\varepsilon_n n^{1/4})^{\sum_{j=1}^{m_1} (\mu_j - 4) + \sum_{s=1}^{m_2} \sum_{j=1}^{u_s} (\gamma_{js} - 2)} = K (\varepsilon_n n^{1/4})^{2k - 4m_1 - 2\sum_{s=1}^{m_2} u_s}. \tag{7.15}$$

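Next the estimation of the weight part $a_G$ will be established. Since $\sum_{j=1}^{n} |a_{jj}|^w = O(n)$ holds for any positive integer $w$, we have

$$\sum_{s_1, \ldots, s_{m_1}} \prod_{C \in \mathcal{F}_1} |a_C| = \sum_{s_1, \ldots, s_{m_1}} \prod_{i=1}^{m_1} |a_{s_i s_i}|^{\mu_i/2} \le \prod_{i=1}^{m_1} \Big( \sum_{k=1}^{n} |a_{kk}|^{\mu_i/2} \Big) \le K n^{m_1}. \tag{7.16}$$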

For a given Type-II component $C_s$ with $t_s$ edges $e_1, \ldots, e_{t_s}$ and $u_s$ vertices $v_1, \ldots, v_{u_s}$, we extract a spanning tree from $C_s$ and assume its edges are $e_1, \ldots, e_{u_s - 1}$, without loss of generality. However, we need to distinguish two situations for the remaining sub-graph after extraction of the spanning tree. Let $\rho(A_n)$ be the spectral norm of the matrix $A_n$.

Case 1. The remaining sub-graph has at least one edge $u \to v$ with $u \ne v$. Note that



Vl 

This, via induction, implies that we have for the tree part 



(7.17) 



E IlM 2 <p(A0 2t 

Vi ,...,u Us j=l 

and for the remaining sub-graph 



E f[K\ 2 <p(A n ) 



2t s -2u s +2 u s -l 



In the second inequality above, we use the fact that $t_s \ge u_s$, as all degrees of vertices of Type-II components are no less than 2. It follows that



u.-l 



1/2 



E IlKi< E II M 2 E II m 2 <p« 

vi,...,v Us j — 1 \vi,...,v Us j — 1 V\,...,V Us j—U s / 

which gives, together with (7.16), that 

em=e n n kcj<^/ 2 +-+«-/ 2 +-. 

Combining (7.15) and (7.19), we obtain 



U n u s /2 



(7.18) 



(7.19) 



,-fe/2 



EE «G^G < n- fe /^ | aG || E ^ G | < ^-^(,^1/4)^-4^,-2^-^ £ 



2fe-4mi-2y^ m2 ■ 



(7.20) 



Case 2. The remaining sub-graph does not contain any edge $u \to v$ with $u \ne v$, i.e., all remaining edges are loops. Since the degree of each vertex of a Type-II component is no less than two, there must exist at least two vertices whose degrees are more than two. Thus (7.14) turns into



e n ^ s 

C S G^2 



(7.21) 



We now need to consider two possibilities. 

(a) If all vertices of a connected sub-graph have a loop, then similar to (7.18), we have 



E IlKi<( E nW E TlwA < P {A ri 

v\ ,...,v U3 j—1 \vi,...,v Us j — 1 vi,...,v Us j—u s / 

and then (7.19) becomes 

G 



,t. n (t..+i)/a 



(7.22) 
But, at this point, there must exist a vertex such that its degree exceeds three and so (7.21), correspondingly, changes into



e n i> c 

C<£F 2 



<K(e n n 1 ^) 



By (7.22) and (7.23), similar to (7.20), we get 



-fe/2 



G 



„ i n '2k — 4m i —2 777,2 u — 4rri2 / n 

< i^n"" 12 / 2 ^ ^ s=1 <Kn- m2 ' 2 . 



(7.23) 



(7.24) 



The last inequality results from the fact that by construction, the exponent of $\varepsilon_n$ is nonnegative. Consequently, the contributions from such graphs can be neglected.

(b) If not all vertices of a connected sub-graph have a loop, then 



Vi t ...,V U3 j — 1 

and, correspondingly, (7.19) becomes 
E \aa\ < Kn u ^ 2+ - +u ^/ 2+mi . 



(7.25) 



To see it, as an example, we consider the following 



U2 V3 u 2 V2 ^^3^3 



where 



E* 

"2 ,"3 

= 0(n 3 / 2 ), 



< 



"2 ,"3 



1/2 , \ 1/2 / 

E 12 "3 "3 J ( E fl "2 V 2 

V3 v 2 



For general cases, we can verify the order by induction. Using (7.13), (7.21) and (7.25), similar to (7.20), we 
get 



-fe/2 



G 



a G tpG 



2fc-4mi-2y^*™ 2 n,-2m, 



<Kn- m2/2 . 



(7.26) 



So the contribution from this kind of graph can also be neglected. Here we remind the reader that (7.20) is obtained by assuming that all $m_2$ Type-II components belong to case 1 and that (7.24) or (7.26) holds if all $m_2$ Type-II components belong to case 2. If some Type-II components of the graph $G$ belong to case 1 and the others pertain to case 2, by a similar argument to the above, one can show that



$$n^{-k/2} \Big| \sum_{G} \mathrm{E}\, a_G \psi_G \Big| = o(1). \tag{7.27}$$



Therefore, if some term involves a connected sub-graph of case 2, the contribution from this term can then be omitted.

In summary, in conjunction with (7.20) and the meanings of $2k$, $4m_1$, $2\sum_{s=1}^{m_2} u_s$, we know that the graphs leading to a non-negligible term are as follows: the degrees of vertices of all its Type-I components must be four; its Type-II components all fall into case 1 and all their vertices are of degree two. More precisely, we know that only the following isomorphic classes give a dominating term:
• $k_1$ double loops $u \to u$ with terms $a_{uu}^2 \big[\sum_{\ell=1}^{K} c_\ell\,(\bar{x}_{\ell u} y_{\ell u} - \rho(\ell))\big]^2$;

• $k_2$ simple cycles $u \to v$, $v \to u$ with terms $|a_{uv}|^2 \big[\sum_{\ell=1}^{K} c_\ell\, \bar{x}_{\ell u} y_{\ell v}\big]\big[\sum_{\ell=1}^{K} c_\ell\, \bar{x}_{\ell v} y_{\ell u}\big]$;

• $k_3$ double arrows $u \to v$, $u \to v$ with terms $a_{uv}^2 \big[\sum_{\ell=1}^{K} c_\ell\, \bar{x}_{\ell u} y_{\ell v}\big]^2$.

In addition, the degrees of vertices satisfy

$$4(k_1 + k_2 + k_3) = 2k,$$

which implies that $k$ must be even. Therefore, let $k = 2p$ be an even integer. We notice that here, the relations on the edges, namely $2(k_1 + k_2 + k_3) = k$, hold automatically. Thus, we can claim that



2p _ 



1 E 



(2p)\ 



nP ^ ki\k 2 lk 3 \ 

ki +k 2 +k 3 =k 



x Ci x C 2 x C 3 + o(l), 



where 



3=1 

k 2 

c 2 = J]e 

i=i 

c 3 1{ 



A" 



a «i«j 1 ^2^(x iuj y euj - p(£)) 
U=i 



{u J -}c{l,2,...,n}, 
{%,^} C {l,2,...,n}, 
{uj,Vj}c{l,2,...,n}. 



Let 



n 1 = E 



n 2 =E 



a 3 =E 



4=1 



By (7.20) or (7.27) again, along with inclusion-exclusion principle, (7.28) turns into 



E# = -(2p-l)!! 



E 



(p)i 



fci!fc 2 !fc 3 ! 

k!+k 2 +k 3 =k Z J (il,j 2 ,i3) = (l,l,l) 



(fcl,fc 2 ,fc3) 

nki k 2 k 3 2 I | 2 2 1 /i\ 

"1 a 2 a 3 % lUjl |a Uj2 ^ 2 | +o(l) 



1 / ™ 

-(2p-l)l! ai £ 

\ It— 1 



0«U + "2 



E 



a U v\ 2 + a 3 J2 a l vj +o(l). 



7.2. Truncation



(7.28) 



(7.29) 
(7.30) 
(7.31) 



The truncation and renormalization under the fourth-moment condition is by now standard, see e.g. [3]. For our purpose and for ease of presentation, we give full details in the case of $K = 1$. The general case goes through in the same manner.
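We aim to replace the entries of $x$ and $y$ with truncated, centralized, normalized variables. Let $\hat{x} = (\hat{x}_1, \ldots, \hat{x}_n)$ and $\tilde{x} = (\tilde{x}_1, \ldots, \tilde{x}_n)$, where

$$\hat{x}_j = x_j\, I(|x_j| \le \varepsilon_n n^{1/4}), \qquad \tilde{x}_j = \hat{x}_j - \mathrm{E}\hat{x}_j, \qquad j = 1, \ldots, n.$$

Since $\mathrm{E}|x_1|^4 < \infty$, for any $\varepsilon > 0$,

$$n\,\mathrm{P}(|x_1| > \varepsilon n^{1/4}) \to 0,$$

and then, because of the arbitrariness of $\varepsilon$, there exists a positive sequence $\varepsilon_n$ such that

$$n\,\mathrm{P}(|x_1| > \varepsilon_n n^{1/4}) \to 0 \quad \text{and} \quad \varepsilon_n \to 0.$$

It follows that

$$\mathrm{P}(x^* A_n y \ne \hat{x}^* A_n y) \le n\,\mathrm{P}(|x_1| > \varepsilon_n n^{1/4}) = o(1). \tag{7.32}$$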
For h = 1, 2, . . . , find rih(nh > nh-i), for all n> rih with 

h 12 



-h 



\ Xl \>^i/h 

Let p n = j- for all n £ [rih, n-h+i] i thus, as n — > oo, p n — > and 



|xi|> ^npn 



(7.33) 



Now, for each n, let j n be the larger of p n and e„. However, in the following we still use the notation e n 
instead of 7„ . By Markov inequality and Burkholder inequality 



\x*A n y - x*A n y\ >5)< |E[xi/(|xi| < e n n 



1/4^1,4 



E|Er=ii/i(ELi^)|- 



^ifEtlari^/daiil^enn 1 / 4 )]**-^- 13 
<ifE|a; 1 | 4 7(| a;i |> en n 1 /4) e -i2 ) 

where we have used the inequality 



S 4 

EjU I S"=i a 3 
s 4 



1 21 2 



(7.34) 



E 



! = 1 

and the fact 



E a ^ 

3=1 



< 



E 



E* 



2 n 2 



E 



E a J' 

i=i 



O(n) 



From (7.34) and (7.33) we have 

x*A n y-x*A n y^0. (7.35) 

Next, we need to normalize the truncated variables Hi's. It is evident that lim n _»oo E|xi | 2 = 1 and that 

|1 - v/^iTFl < |l-E|xi| 2 | < 2E[|x- 1 | 2 /(| a ; 1 | > e n #n)] 

< 2e- 2 7i- 1 / 2 E[| a; i| 4 /(| a; i| > En^n)], (7.36) 
which, together with (7.33), gives that
trA 



(1- VEj^p)- 



<2e- 2 E|x 1 | 4 /(| 2 ; 1 |>e n ^)^0. (7.37) 
Combining (7.32), (7.35) and (7.37), it is now sufficient to consider 

i / yj^ 

instead of $\frac{1}{\sqrt{n}}(x^* A_n y - \rho \operatorname{tr} A_n)$. Moreover, it is not difficult to see that we can substitute $\rho' = \operatorname{cov}\big(\tilde{x}_1/\sqrt{\mathrm{E}|\tilde{x}_1|^2},\, y_1\big)$ for $\rho$ without altering the weak limit.

The truncation, centralization and normalization of $y$ can be completed with a similar argument as above. In the sequel, for simplicity, we shall suppress all superscripts on the variables and suppose that $|x_i| \le \varepsilon_n n^{1/4}$, $|y_i| \le \varepsilon_n n^{1/4}$, $\mathrm{E}x_i = \mathrm{E}y_i = 0$, $\mathrm{E}|x_i|^2 = \mathrm{E}|y_i|^2 = 1$, $\mathrm{E}|x_i|^4 < \infty$, $\mathrm{E}|y_i|^4 < \infty$, and we still denote by $\rho$ the covariance between the transformed variables.